Re: Deployment question: is Jackrabbit process-safe? [SEC=UNCLASSIFIED]

2012-09-03 Thread Ross . Dyson
The first Jackrabbit instance to start up the repository locks it by 
creating a lock file, so a second instance cannot open the same repository 
directly.
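
For anyone wondering what locking a repository with a lock file looks like in practice, here is a minimal sketch of the general technique in plain Java. It is not Jackrabbit's own implementation: the class name, the `.lock` file name and the use of a plain marker file are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: claims exclusive use of a directory via a marker file. */
final class LockFileSketch {

    // Hypothetical marker file name, chosen for this sketch.
    private static final String LOCK_NAME = ".lock";

    /** Returns true if this caller created the lock file, false if it already existed. */
    static boolean tryAcquire(File repositoryHome) {
        Path lock = repositoryHome.toPath().resolve(LOCK_NAME);
        try {
            // Files.createFile checks and creates atomically, so only one caller can succeed.
            Files.createFile(lock);
            return true;
        } catch (FileAlreadyExistsException alreadyLocked) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Removes the lock file so that another process can take over the directory. */
    static void release(File repositoryHome) {
        try {
            Files.deleteIfExists(repositoryHome.toPath().resolve(LOCK_NAME));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File home = new File(System.getProperty("java.io.tmpdir"), "repo-" + System.nanoTime());
        home.mkdirs();
        System.out.println("first start:  " + tryAcquire(home));
        System.out.println("second start: " + tryAcquire(home));
        release(home);
        home.delete();
    }
}
```

With a scheme like this, the second webapp to start against the same repository home finds the lock already taken instead of cooperating with the first.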

Ross



From:   Esmond Pitt esmond.p...@bigpond.com
To: users@jackrabbit.apache.org
Date:   04/09/2012 10:04 AM
Subject:Deployment question: is Jackrabbit process-safe?



I have a clustered web-app that uses Jackrabbit. I presently have Jackrabbit
in yet another Tomcat and am using RMI to communicate with it. However, it
occurs to me that I could just include the Jackrabbit API jars in the webapp
and call it directly, *provided* multiple copies of the Jackrabbit API will
co-operate correctly on the database. Is this the case?
 
EJP


--
This message contains privileged and confidential information only 
for use by the intended recipient.  If you are not the intended 
recipient of this message, you must not disseminate, copy or use 
it in any manner.  If you have received this message in error, 
please advise the sender by reply e-mail.  Please ensure all 
e-mail attachments are scanned for viruses prior to opening or 
using.


Re: Content repository not initializing [SEC=UNCLASSIFIED]

2012-07-24 Thread Ross . Dyson
Dude,

You are using an unsupported version of Jackrabbit, and you don't describe 
your deployment. Is this a problem with an existing repository? Can it be 
reproduced with a new repository?
You need to supply some details!

Ross



From:   AtulMaurya atulmaury...@gmail.com
To: users@jackrabbit.apache.org
Date:   23/07/2012 05:06 PM
Subject:Re: Content repository not initializing



Hello everyone, 

I am new to Jackrabbit and badly in need of help with the issue above.
Please suggest something if you have any ideas. I have tried all the
workarounds suggested on different forums for this issue.

Thanks a ton in advance.



--
View this message in context: 
http://jackrabbit.510166.n4.nabble.com/Content-repository-not-initializing-tp4656033p4656084.html

Sent from the Jackrabbit - Users mailing list archive at Nabble.com.




RE: Error when install jackrabbit-bundle into OSGi container [SEC=UNCLASSIFIED]

2012-07-24 Thread Ross . Dyson
I don't see it claimed anywhere that Jackrabbit is ready for OSGi;
http://wiki.apache.org/jackrabbit/JackrabbitOsgi is from 2010.





From:   XiLai Dai xl...@talend.com
To: users@jackrabbit.apache.org users@jackrabbit.apache.org
Date:   25/07/2012 12:47 PM
Subject:RE: Error when install jackrabbit-bundle into OSGi 
container



Anyone here? If no one gives an answer, I will treat it as a bug and log a 
JIRA issue.

Xilai
-Original Message-
From: XiLai Dai [mailto:xl...@talend.com] 
Sent: Friday, July 20, 2012 3:28 PM
To: users@jackrabbit.apache.org
Subject: Error when install jackrabbit-bundle into OSGi container

Hello,

We want to use Jackrabbit in an OSGi container (Karaf). An exception is 
thrown when installing the all-in-one jackrabbit-bundle:

 install -s mvn:org.apache.jackrabbit/jackrabbit-bundle/2.4.2
org.osgi.framework.BundleException: The bundle 
org.apache.jackrabbit.jackrabbit-bundle_2.4.2 [219] could not
be resolved. Reason: Missing Constraint: Import-Package: com.ibm.db2.jcc; 
version=0.0.0

Why is com.ibm.db2.jcc a mandatory import package?

Thanks in advance
Xilai




Re: failed to store property state [SEC=UNCLASSIFIED]

2012-04-02 Thread Ross . Dyson
My repository is 1.2 TB.



From:   ramesh_meenava...@satyam.com ramesh_meenava...@satyam.com
To: users@jackrabbit.apache.org
Date:   03/04/2012 12:58 PM
Subject:Re: failed to store property state



The issue is not with disk space; it could be something else, such as
concurrent threads or the amount of data (my repository has currently
reached 25 GB).

--
View this message in context: 
http://jackrabbit.510166.n4.nabble.com/failed-to-store-property-state-tp4525301p4527687.html

Sent from the Jackrabbit - Users mailing list archive at Nabble.com.




RE: Jackrabbit 2.2.5 - loss of data [SEC=UNCLASSIFIED]

2011-12-22 Thread Ross . Dyson
Can you put a println in the Jackrabbit code and confirm the expected path 
when the exception is thrown?



From:   Shah, Sumit (CGI Federal) sumit.s...@cgifederal.com
To: users@jackrabbit.apache.org users@jackrabbit.apache.org
Date:   21/12/2011 02:45 AM
Subject:RE: Jackrabbit 2.2.5 - loss of data [SEC=UNCLASSIFIED]



Thanks Ross. It seems like the content is present on the filesystem. I can 
see the old documents in the repository/datastore folders. But the link 
between the Jackrabbit metadata (e.g. path) and the content seems to be 
broken. Is there any reason why this would happen?

Does Jackrabbit use UUIDs internally to store the metadata and the content 
itself?

Thanks
Sumit

From: ross.dy...@ipaustralia.gov.au [mailto:ross.dy...@ipaustralia.gov.au]
Sent: Monday, December 19, 2011 9:20 PM
To: users@jackrabbit.apache.org
Cc: users@jackrabbit.apache.org
Subject: Re: Jackrabbit 2.2.5 - loss of data [SEC=UNCLASSIFIED]

This looks suspiciously like a problem I have had before, where somebody 
writes a script to delete files that look like temp files, no file 
extensions, over a month old.  I had one that was deleting classes created 
at runtime, so each morning there was a good chance of getting classloader 
errors.

Best of luck.



From:Shah, Sumit (CGI Federal) sumit.s...@cgifederal.com
To:users@jackrabbit.apache.org users@jackrabbit.apache.org
Date:20/12/2011 11:58 AM
Subject:Jackrabbit 2.2.5 - loss of data




Hi All,

I am running into a serious issue. It seems like I am unable to retrieve 
documents from Jackrabbit that are more than a month old. I get the 
following error:

JCR Action 'Get stream' cannot be performed because the provided path 
does not exist

I am running Jackrabbit in standalone mode and also in a clustered 
environment. I am seeing the same issue on both. When does this happen? Is 
there a self initiated process that cleans up the data within Jackrabbit? 
What are the possible resolutions to this?

I would appreciate any help on this.

Thanks
Sumit




Re: Jackrabbit 2.2.5 - loss of data [SEC=UNCLASSIFIED]

2011-12-19 Thread Ross . Dyson
This looks suspiciously like a problem I have had before, where somebody 
writes a script to delete files that look like temp files, no file 
extensions, over a month old.  I had one that was deleting classes created 
at runtime, so each morning there was a good chance of getting classloader 
errors.
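
To make that failure mode concrete, the rule such a script applies can be sketched as below. This is a hypothetical reconstruction in Java, not the actual script: the class name, the 30-day threshold and the sample file names are assumptions. Files that a repository writes to disk may well have no extension, so they would match a rule like this once they are old enough.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: mimics the rule "no file extension and older than N days". */
final class StaleFileRule {

    /** True if a cleanup script using this rule would delete the file. */
    static boolean wouldBeDeleted(String fileName, long lastModifiedMillis,
                                  long nowMillis, int maxAgeDays) {
        // A leading dot (index 0) is treated as a hidden file, not as an extension.
        boolean hasExtension = fileName.lastIndexOf('.') > 0;
        long ageMillis = nowMillis - lastModifiedMillis;
        return !hasExtension && ageMillis > TimeUnit.DAYS.toMillis(maxAgeDays);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long fortyDaysAgo = now - TimeUnit.DAYS.toMillis(40);
        // Hypothetical extension-less name, standing in for a repository data file.
        System.out.println(wouldBeDeleted("3f2a9c", fortyDaysAgo, now, 30));
        System.out.println(wouldBeDeleted("report.pdf", fortyDaysAgo, now, 30));
    }
}
```

If anything on the server applies a rule of this shape, excluding the repository directories from it is the first thing to check.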

Best of luck.



From:   Shah, Sumit (CGI Federal) sumit.s...@cgifederal.com
To: users@jackrabbit.apache.org users@jackrabbit.apache.org
Date:   20/12/2011 11:58 AM
Subject:Jackrabbit 2.2.5 - loss of data



Hi All,

I am running into a serious issue. It seems like I am unable to retrieve 
documents from Jackrabbit that are more than a month old. I get the 
following error:

JCR Action 'Get stream' cannot be performed because the provided path 
does not exist

I am running Jackrabbit in standalone mode and also in a clustered 
environment. I am seeing the same issue on both. When does this happen? Is 
there a self initiated process that cleans up the data within Jackrabbit? 
What are the possible resolutions to this?

I would appreciate any help on this.

Thanks
Sumit




Re: Maximum number of workspaces each repository can have?? [SEC=UNCLASSIFIED]

2011-12-06 Thread Ross . Dyson
My application has about 1.5 TB: 6 or 7 million files and associated 
metadata.

It's a publication-style system: one thread updates and everyone else reads.

Ross.



From:   Bertrand Delacretaz bdelacre...@apache.org
To: users@jackrabbit.apache.org
Date:   07/12/2011 07:06 AM
Subject:Re: Maximum number of workspaces each repository can 
have??



Hi,

On Tue, Dec 6, 2011 at 6:05 PM, Vishwanath Dubey vnd...@gmail.com wrote:
> My application is a ticket-based tool where one ticket corresponds to one
> workspace, so we did not see any limitation on the number of workspaces
> as long as there is sufficient space on the hard drive

Note that having one workspace per unit of content (ticket in your
case) is very unusual - if it works for you all the better, but it's
much more common to use paths to separate such things. That usually
works much better with tools like JCR explorers, Sling, etc.

David's model [1] rule #3 has more details.

-Bertrand

[1] http://wiki.apache.org/jackrabbit/DavidsModel




Re: Indexing Binaries JackRabbit webapp-2.2.10 [SEC=UNCLASSIFIED]

2011-11-27 Thread Ross . Dyson
I believe I got the same message in the logs when I was indexing PDF 
documents that had no text component, i.e. they were made from image files 
only.  Is this possible in your case?

Ross.



From:   Stephanp stephan.hackst...@googlemail.com
To: users@jackrabbit.apache.org
Date:   28/11/2011 06:23 AM
Subject:Re: Indexing Binaries JackRabbit webapp-2.2.10



Additionally, I tested everything against jackrabbit-webapp-2.1.6. With this
release it works fine. I suppose there could be a dependency error in
release 2.2.10.

regards,
stephan

--
View this message in context: 
http://jackrabbit.510166.n4.nabble.com/Indexing-Binaries-JackRabbit-webapp-2-2-10-tp4106625p4113125.html

Sent from the Jackrabbit - Users mailing list archive at Nabble.com.




Re: prop.setProperty(jcr:data, Bianary) [SEC=UNCLASSIFIED]

2011-04-11 Thread Ross . Dyson
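I changed my code to use the value factory:

    FileInputStream in = null;
    Binary binary = null;

    try {
        in = new FileInputStream(fileName);

        // Create the Binary through the session's ValueFactory
        // rather than constructing a BinaryValue directly.
        binary = session.getValueFactory().createBinary(in);

        node.setProperty("jcr:data", binary);
    } finally {
        // Release resources once the property has been set.
        if (binary != null) binary.dispose();
        if (in != null) in.close();
    }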
 



From:   bokie jms.cer...@gmail.com
To: users@jackrabbit.apache.org
Date:   10/04/2011 05:02 AM
Subject:prop.setProperty(jcr:data, Bianary)



Hi,

I am currently playing around with Jackrabbit 2.2.5 and my test is throwing
the exception shown below when calling:
  node.setProperty("jcr:data", new BinaryValue(new FileInputStream(file)).getBinary());
but is working fine when calling:
  node.setProperty("jcr:data", new BinaryValue(new FileInputStream(file)).getStream());

NOTE:
Node.setProperty(String, InputStream) is marked as deprecated.

### Exception start #
org.apache.jackrabbit.rmi.client.RemoteRepositoryException: java.rmi.MarshalException: error marshalling arguments; nested exception is:
 java.io.NotSerializableException: org.apache.jackrabbit.value.BinaryImpl
 at org.apache.jackrabbit.rmi.client.ClientNode.setProperty(ClientNode.java:134)
 at org.apache.jackrabbit.rmi.client.ClientNode.setProperty(ClientNode.java:236)
 at jmdsc.jackrabbit.Main.addFileNode(Main.java:72)
 at jmdsc.jackrabbit.Main.test1(Main.java:38)
 at jmdsc.jackrabbit.Main.main(Main.java:21)
Caused by: java.rmi.MarshalException: error marshalling arguments; nested exception is:
 java.io.NotSerializableException: org.apache.jackrabbit.value.BinaryImpl
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:138)
 at org.apache.jackrabbit.rmi.server.ServerNode_Stub.setProperty(Unknown Source)
 at org.apache.jackrabbit.rmi.client.ClientNode.setProperty(ClientNode.java:129)
 ... 4 more
Caused by: java.io.NotSerializableException: org.apache.jackrabbit.value.BinaryImpl
 at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1164)
 at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1518)
 at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1483)
 at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
 at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
 at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
 at sun.rmi.server.UnicastRef.marshalValue(UnicastRef.java:274)
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
### Exception end #


--
View this message in context: 
http://jackrabbit.510166.n4.nabble.com/prop-setProperty-jcr-data-Bianary-tp3438835p3438835.html

Sent from the Jackrabbit - Users mailing list archive at Nabble.com.




Re: jackrabbit, lucene, tika ... and pdfbox [SEC=UNCLASSIFIED]

2011-03-08 Thread Ross . Dyson
Wasn't it OK for you to add an indexing_configuration.xml and exclude the 
binary data from indexing?

I tried that and reduced the re-indexing time to around 25% of what it was, 
presumably with much smaller index files too.  I left in the attributes that 
I may sometimes want to search on, e.g. unique keys and document title.

Ross.



From:   Kevin Jansz kevin.ja...@exari.com
To: users@jackrabbit.apache.org
Date:   09/03/2011 02:52 PM
Subject:jackrabbit, lucene, tika ... and pdfbox



It's been discussed on this list before but I'm summarising my latest
issues/findings ...

Our use of jackrabbit is for content storage without the built-in
search/querying mechanism. It's possible to leave out the
SearchIndex definition in the configuration but you're effectively
breaking the weak reference handling (used by user-management) -
non-critical and the repository *seems* to work without it despite
logging warnings. But I feel it's better to leave the SearchIndex,
therefore querying in ... so:

Weak-references
- requires SearchIndex / querying
- requires lucene (for now, there's no simple alternative)
- requires tika (core)
- requires various other format handling libraries for
different parser implementations

In jackrabbit 2.1.x if you want custom parsers - or in my case no
parsers and the associated overhead and library dependence - you can't
easily do this as the jackrabbit-core jar includes a tika-config.xml
and loads this explicitly (from
org\apache\jackrabbit\core\query\lucene\tika-config.xml). The only
work-around is to replace this file in the jar file - not ideal.

It's raised in JIRAs JCR-2642 (and then TIKA-317) that making (very
sensible) use of the jar file Service Provider mechanism could
simplify things. Drop a jar file into the classpath that defines
parsers and this gets used ... my reading of this was that to get no
parsers we'd simply leave out tika-parsers-0.8.jar from the classpath.
It also made sense that the jackrabbit-core may still include a
tika-config.xml to a) use DefaultParser b) explicitly disable zip and
image extraction. Unfortunately, on upgrading to 2.2.4 errors about
missing pdfbox libraries (when storing PDF content) led me to this in
tika-config.xml (in the jackrabbit-core jar file):
<parser class="org.apache.jackrabbit.core.query.pdf.PDFParser">
  <!-- JCR-2838: Override the faulty PDF parser in Tika 0.8 -->
  <mime>application/pdf</mime>
</parser>

Looking at JIRAs JCR-2838 (and then TIKA-548) it's clear there's a
problem. I'm not entirely sure why the workaround is in
jackrabbit-core. I would have thought putting this in a
x-parsers-2.2.4.jar with a META-INF/services/... definition would
have been the correct way to handle this? To avoid issues of
parser/service-provider precedence? Perhaps a separate jar-build for
this issue would be overkill for a point release?

It's not a huge issue I guess as it seems with tika 0.9 (or 0.8.1?)
the PDF parser issue will be resolved in which case I expect the code
in org.apache.jackrabbit.core.query.pdf.* will disappear along with
reference to it from the tika-config.xml. In the mean time we're back
to having to replace
org\apache\jackrabbit\core\query\lucene\tika-config.xml in the
jackrabbit-core to avoid custom parsers (and errors about their
dependencies). I'm taking the time to mention it here in case it saves
someone time and also to gauge if our view of lucene, tika and the
parsers is incorrect - that future releases of jackrabbit may still
include parsers other than DefaultParser and EmptyParser in its
tika-config.xml.

Regards,
Kevin

--
Kevin Jansz
kevin.ja...@exari.com
Level 7, 10-16 Queen Street, Melbourne 3000 Australia
Tel +61 3 9621 2773 | Fax +61 3 9621 2776
Exari Systems
Boston | London | Melbourne | Munich
www.exari.com

Test drive our software online - www.exari.com/demo-trial.html
Read our blog on document assembly - blog.exari.com




Dont want to index the content of my documents [SEC=UNCLASSIFIED]

2011-02-14 Thread Ross . Dyson
Hi

All the search criteria I require are declared as properties of my node 
type.  Indexing the content is unnecessary, and rebuilding the indexes 
when they appear corrupted must take longer.

I have read the wiki page about indexing_configuration.xml. If it is badly 
configured I get an error, but after rebuilding the indexes I can still 
find documents based on file content.

I added this:
 <index-rule nodeType="jcr:content">

  </index-rule>

thinking all the jcr:content properties would be excluded, but that doesn't 
seem to have made any difference.


Thanks

Ross.


Rebuilding the lucene index [SEC=UNCLASSIFIED]

2010-08-30 Thread Ross . Dyson
Hi all

After having some problems with Lucene trying to create more than 100,000 
subdirs under the index directory, and trying to rebuild the indexes but 
then having troubles that workspace.xml was deleted and blah blah, (try to 
avoid doing anything under pressure) I thought I would have a practise on 
my development machine.

I am using the 2.1 rar file deployed on weblogic 10.  I am using a very 
default installation, without clustering.

I stopped the server, deleted the workspaces/default/index directory and 
restarted the server.  The re-index occurs in the startup thread, so my 
weblogic has not restarted yet.

My production environment has many documents, and several systems deployed 
per server.  There will be trouble if I try to stop those other systems 
for a couple days while I re-index.

1) Am I likely to get the problem of 100K subdirs blowing up my system 
when I get my test data reinstated (the loader runs for a week)?
2) Is there a way to re-index that will not cost me some body parts?

Thanks


Ross.


Re: Is there a way to store JackRabbit documents in two different datastores for one repository and yet index them with Lucene [SEC=UNCLASSIFIED]

2010-07-20 Thread Ross . Dyson
Isn't this the kind of thing that the JBOSS JCR implementation is supposed 
to handle?


Ross.


Bénigot Yves yves.beni...@ginerativ.fr wrote on 20/07/2010 02:14:46 AM:

 From: Bénigot Yves yves.beni...@ginerativ.fr
 To: users@jackrabbit.apache.org
 Date: 20/07/2010 02:15 AM
 Subject: Is there a way to store JackRabbit documents in two 
 different datastores for one repository and yet index them with Lucene
 
 I have two categories of documents in one repository:
 
 -  some files are stored by the users; they would go in a file datastore
 
 -  other files are generated, and already stored in an SQL table
    in a BLOB column
 
 I would like to create a specific DataStore or PersistenceManager to
 be able to leave the files stored in the SQL table where they are,
 yet define them as Jackrabbit documents, and let Lucene index them.
 
 Is it possible to have such a hybrid setup with two different
 storages seen as one repository in Jackrabbit?
 
 
 
 yves
 



Re: Jackrabbit Deployment [SEC=UNCLASSIFIED]

2010-07-18 Thread Ross . Dyson
Split the update components from the back end service and offer them as 
web services from the web app part.

The back end service becomes a client.  Use a shared directory and pass 
paths to the files, to save encoding them into SOAP messages.


Ross.

Jawad Bokhari jawad.bokh...@gmail.com wrote on 17/07/2010 07:10:54 PM:

 From: Jawad Bokhari jawad.bokh...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 17/07/2010 07:11 PM
 Subject: Jackrabbit Deployment
 
 Hi,
 
 I want to set up a single Jackrabbit repository that should be accessible
 from the web for viewing documents (read-only) and from a back-end service
 that populates the repository (read and write) with documents on a
 scheduled basis.
 
 Both the web application and the back-end service lie on the same machine.
 I am using a file-based repository at the moment.
 Currently, I can't run both simultaneously. If I start the back-end service,
 it applies a lock on the repository and the web-app can't access it. So I
 always have to stop the back-end service to let the web application show
 the documents archived, even in just read-only mode.
 
 I know there is a deployment model based on Repository Server, but that
 would be too complex to set up. Is there any simpler mechanism to achieve
 my desired model?
 
 Many thanks,
 
 Jawad



Re: Not returning any result when I am doing text search from added file from jackrabbit [SEC=UNCLASSIFIED]

2010-05-12 Thread Ross . Dyson
I use the InputStream from the file to create an instance of 
javax.jcr.Binary, and set that, not just the InputStream.  I don't know if 
the code that reads the text contents uses the MIME type to know how to 
read the data, but I would say it probably does.

Node file = appNode.addNode(documentId, "nt:file");
Node resource = file.addNode("jcr:content", "ed:docreftype");
InputStream in = new FileInputStream(fileName);

Binary binary = session.getValueFactory().createBinary(in);

resource.setProperty("jcr:data", binary);
// Dispose of the Binary only after it has been set on the property.
binary.dispose();
in.close();

// Derive the MIME type from the file extension.
String mimeType = null;
if (fileName.endsWith(".txt")) {
    mimeType = "text/plain";
}
else if (fileName.endsWith(".xls")) {
    mimeType = "application/vnd.ms-excel";
}
else if (fileName.endsWith(".ppt")) {
    mimeType = "application/mspowerpoint";
}
else {
    mimeType = "application/octet-stream";
}

resource.setProperty("jcr:mimeType", mimeType);
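
The extension check above can also be written as a small lookup table, which is easier to extend than a chain of if/else blocks. This is only a sketch: the class and method names are hypothetical, the `pdf` entry is an addition that is not in the code above, and the PowerPoint entry uses the IANA-registered `application/vnd.ms-powerpoint` rather than the value shown above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: maps a file name to a MIME type by its extension. */
final class MimeTypeLookup {

    private static final String DEFAULT_TYPE = "application/octet-stream";

    private static final Map<String, String> BY_EXTENSION = new LinkedHashMap<>();
    static {
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("pdf", "application/pdf"); // assumption: not in the original code
        BY_EXTENSION.put("xls", "application/vnd.ms-excel");
        BY_EXTENSION.put("ppt", "application/vnd.ms-powerpoint");
    }

    /** Returns the mapped MIME type, or application/octet-stream if unknown. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return DEFAULT_TYPE; // no extension at all
        }
        // Lower-case the extension so that "REPORT.TXT" is handled like "report.txt".
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(extension, DEFAULT_TYPE);
    }

    public static void main(String[] args) {
        System.out.println(forFileName("notes.txt"));
        System.out.println(forFileName("archive.bin"));
    }
}
```

The result would then be passed to resource.setProperty("jcr:mimeType", ...) as in the code above.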

Ross

Jenni Pothu jen...@virtusa.com wrote on 12/05/2010 09:23:49 PM:

 From: Jenni Pothu jen...@virtusa.com
 To: users@jackrabbit.apache.org
 Date: 12/05/2010 09:24 PM
 Subject: Not returning any result when I am doing text search from 
 added file from jackrabbit
 
 Hi All,
 
 I am attaching the file to blogEntry node and blogEntry
 node is added into repository. Now I want to search the content from the
 added file.
 
 
 
  I have node like blogEntry and I am adding file to that node like
 below
 
 //getting the
 blogEntryNode form method getBlogEntryNode(blogTitle, session);
 
 Node blogEntryNode = getBlogEntryNode(blogTitle, session);
 
 Node NewblogEntry = blogEntryNode.addNode("NewblogEntry", "nt:file");
 
   Node resNode = NewblogEntry.addNode("jcr:content", "nt:resource");
 
   resNode.setProperty("jcr:data", file);
 
 
 
 here file is of type Inputstream.
 
 
 
 When I am searching with a query like:
 
   Query query = queryManager.createQuery(
       "//blogEntry/NewblogEntry(*, nt:file)[jcr:contains(., '" + sometext + "')]",
       Query.XPATH);
 
 Here sometext is any content which I want to search in the file.
 
 
 
 My query is not returning any result. Is my query correct? Please correct
 me if anything is wrong in the code. Thanks in advance. 
 
 
 
 Thanks,
 
 Jenni
 
 
 

 
 
 




Re: Can we add text files to jackrabbit repository [SEC=UNCLASSIFIED]

2010-05-05 Thread Ross . Dyson
Jenni

I think you misunderstand what happens when you put stuff in the 
repository.

Nodes are stored as data, not as obvious things (file representations of 
the data)  in the file system. 

Even when you have the configuration to store files in the file system, 
they do not magically appear as your files in the file system.  They are 
there, but the names are changed to enhance the storage of the data.
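
To illustrate why the stored names look nothing like your own file names, here is a sketch of a content-addressed layout, where a file is named after a hash of its content and nested under directories taken from the first hash characters. Jackrabbit's file data store works along these lines, but treat the exact layout, the class name and the method names here as assumptions rather than a description of its internals.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: derives a storage path from the content itself. */
final class ContentAddressedName {

    /** Returns the SHA-1 digest of the content as lower-case hex. */
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 should be available on every Java platform", e);
        }
    }

    /** Builds a relative path such as ab/cd/ef/abcdef... from the digest. */
    static String storagePath(byte[] content) {
        String id = sha1Hex(content);
        return id.substring(0, 2) + "/" + id.substring(2, 4) + "/"
                + id.substring(4, 6) + "/" + id;
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(storagePath(content));
    }
}
```

The original file name plays no part in the path, which is why you need to go through the repository API (or a browser page like the one described below) to find your nodes.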

When I started I made a simple war file and used the default jackrabbit 
search.jsp to find the nodes I had added, and then added a page that 
starts with items in the root of the default workspace and allows you to 
walk down the tree, listing the nodes and their attributes.

I hope this helps to get you started.

Ross.



From:   Jenni P jenni@gmail.com
To: users@jackrabbit.apache.org
Date:   05/05/2010 10:56 PM
Subject:Can we add text files to jackrabbit repository



Hi All,
  I am new to the Jackrabbit repository. I want to add a text file to
the Jackrabbit repository. Can we add text files to the repository? I tried
the example to add nodes to the repository, but how do I check whether the
nodes were added or not? Where on my machine can I check the nodes?
Please guide me. Thanks in advance.

Thanks,
jenni




RE: Getting noSuchMethodError in jackrabbit session.login() method [SEC=UNCLASSIFIED]

2010-05-04 Thread Ross . Dyson
Jenni

I think you need to look again at your classpath and the instructions in 
the First Hops example.

The only jar file you need in your path is 
jackrabbit-standalone-2.x.x.jar, it contains unjarred classes from all of 
the dependent libraries.


Give that a shot and see how it goes.


Ross.




Re: Sane file hierarchy when persisting file/folder nodes [SEC=UNCLASSIFIED]

2010-04-05 Thread Ross . Dyson
You can mount the JCR repository using the Webdav interface, and so you 
get a view of the repository as if it were all just files in folders.

I don't believe the files are stored in binary, they are really just plain 
text.

Ross.



Tim Terlegård tim.terleg...@gmail.com wrote on 01/04/2010 07:09:21 PM:

 From: Tim Terlegård tim.terleg...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 01/04/2010 07:10 PM
 Subject: Sane file hierarchy when persisting file/folder nodes
 
 I'm investigating whether using Jackrabbit/JCR is a good abstraction
 when dealing with files.
 
 I created a workspace (myspace) and used BundleFsPersistenceManager.
 When I tried the code below it created the file
 repository/workspaces/myspace/items/54/d6/c2bc4964477ab060d399ea532.n
 I'd rather it create
 repository/workspaces/myspace/items/hi.txt
 
 Is it possible to get a nice file hierarchy and not these random
 numbers? And is it possible to store the files as plain text instead
 of binary?
 
 Thanks,
 Tim
 
 
 Node root = session.getRootNode();
 Node fileNode = root.addNode("hi.txt", "nt:file");
 Node resNode = fileNode.addNode("jcr:content", "nt:resource");
 resNode.setProperty("jcr:mimeType", "text/plain");
 resNode.setProperty("jcr:encoding", "utf8");
 resNode.setProperty("jcr:data", new FileInputStream(new File("hi.txt")));
 session.save();
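
The opaque file names in the question come from the persistence manager keying each stored node by its internal identifier rather than by its JCR name, so "hi.txt" only exists as a name at the API level. A minimal sketch of that kind of layout, assuming the identifier is a UUID split into two two-character directories plus a remainder. ShardedPath, pathFor and the sample UUID are hypothetical and are not taken from the Jackrabbit source:

```java
import java.util.UUID;

// Hypothetical illustration: maps a node UUID to a sharded relative file path.
class ShardedPath {

    // Drops the dashes, then uses the first two pairs of hex digits as directories.
    static String pathFor(UUID id) {
        String hex = id.toString().replace("-", "");
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/"
                + hex.substring(4) + ".n";
    }

    public static void main(String[] args) {
        // Made-up identifier, used only to show the shape of the result.
        UUID id = UUID.fromString("0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9");
        System.out.println(pathFor(id));
    }
}
```

Spreading files over nested directories like this keeps any single directory small, which is one plausible reason for not mirroring the JCR tree on disk.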



Observations of SQL-2 queries [SEC=UNCLASSIFIED]

2010-03-31 Thread Ross . Dyson
I have an archive of PDF documents, with metadata.  Mostly I will find the 
documents by traversing the tree, but sometimes I will need to get them by 
their unique identifier (from the source system) which I use for the 
file's node name and as a property on the jcr:content subnode.

Here is my data structure:

folder/
  folder/
    nt:file (name = 282675)
      ed:docreftype (name = jcr:content)
        property jcr:data = binary
        property ed:document_id = 282675
        ..other properties..

About 100K records; no folder contains > 99 nodes.

select * from [nt:file] as doc where doc.name = '282675'
(takes 30 seconds, finds nothing)

select * from [nt:file] as doc inner join [ed:docreftype] as content on 
ischildnode(content, doc)
where content.[ed:document_id] = '282675'
(takes 50 seconds, finds the right record)

select * from [nt:file] as doc inner join [ed:docreftype] as content on 
ischildnode(content, doc) 
where contains(content.[ed:document_id], '282675') and 
content.[ed:document_id] = '282675'
(takes 2 seconds, finds the correct record)

So getting the Lucene index to do a first cut of the results helps the 
whole process.
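
A minimal sketch of building the fast form of the query as a string, assuming the node type and property names from the message above. DocumentIdQuery and its methods are hypothetical helpers, not part of the JCR API. The sketch also includes a node-name variant: in JCR-SQL2, `doc.name` refers to a property called "name", which is likely why the first query finds nothing, whereas the node's own name is tested with the `name(doc)` function.

```java
// Hypothetical helper: builds JCR-SQL2 statements like the ones in the message.
class DocumentIdQuery {

    // Doubles single quotes so the value is safe inside a SQL-2 string literal.
    // Note: contains() also has its own full-text syntax; plain numeric ids
    // such as 282675 need no extra escaping there.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // contains() lets the full-text index pre-filter; the equality test then
    // confirms an exact match on the property.
    static String byDocumentId(String id) {
        String literal = quote(id);
        return "select * from [nt:file] as doc "
                + "inner join [ed:docreftype] as content on ischildnode(content, doc) "
                + "where contains(content.[ed:document_id], " + literal + ") "
                + "and content.[ed:document_id] = " + literal;
    }

    // Compares the node name itself rather than a property called "name".
    static String byNodeName(String id) {
        return "select * from [nt:file] as doc where name(doc) = " + quote(id);
    }

    public static void main(String[] args) {
        System.out.println(byDocumentId("282675"));
        System.out.println(byNodeName("282675"));
    }
}
```

The resulting string would be passed to the query manager with the JCR-SQL2 language constant; that part is left out because it needs a live session.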

Prophecy:
He who pulls the mighty sword ExQueryString from the stone JSR-283 shall 
be the rightful king of all Content.



Re: NoClassDefFoundError accessing jackrabbit jar from servlet [SEC=UNCLASSIFIED]

2010-03-30 Thread Ross . Dyson
Righto, then add jackrabbit-api-2.0.0.jar and 
jackrabbit-jcr-commons-2.0.0.jar into the same directory and see what 
happens.

(This is what I have in my servlet application).
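
One way to narrow down this kind of error is to check, from inside the servlet, which of the expected classes the web application's class loader can actually see. A minimal sketch using only the JDK follows. ClasspathProbe is a hypothetical helper, and the class names in main are the ones from the stack traces in this thread:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic: reports which class names cannot be loaded.
class ClasspathProbe {

    static List<String> missing(List<String> classNames) {
        // In a servlet container the context class loader is the webapp's loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathProbe.class.getClassLoader();
        }
        List<String> notFound = new ArrayList<>();
        for (String name : classNames) {
            try {
                // 'false' avoids running static initialisers while probing.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        System.out.println(missing(Arrays.asList(
                "javax.jcr.Repository",
                "org.apache.jackrabbit.core.TransientRepository")));
    }
}
```

Calling missing(...) from doPost() and logging the result would show whether the jars in WEB-INF/lib are being picked up at all.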

Rob Brown rlb.so...@gmail.com wrote on 30/03/2010 05:39:49 PM:

 From: Rob Brown rlb.so...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 30/03/2010 05:40 PM
 Subject: Re: NoClassDefFoundError accessing jackrabbit jar from 
 servlet  [SEC=UNCLASSIFIED]
 
 Good question - this results in a different error:
 
 java.lang.NoClassDefFoundError: 
org/apache/jackrabbit/core/TransientRepository
 
 
 
 
 On Mon, Mar 29, 2010 at 11:46 PM, ross.dy...@ipaustralia.gov.au wrote:
 
  What happens if you put the specific jar file jcr-2.0.jar in the
  WEB-INF/lib directory?
 
 
 
 



Re: NoClassDefFoundError accessing jackrabbit jar from servlet [SEC=UNCLASSIFIED]

2010-03-30 Thread Ross . Dyson
Also better add jackrabbit-core-2.0.0.jar.



Rob Brown rlb.so...@gmail.com wrote on 30/03/2010 06:05:29 PM:

 From: Rob Brown rlb.so...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 30/03/2010 06:06 PM
 Subject: Re: NoClassDefFoundError accessing jackrabbit jar from 
 servlet  [SEC=UNCLASSIFIED]
 
 Seems to cause the same error. I'll test some more tomorrow but it's 
getting
 late here...
 
 Thanks for the suggestion.
 
 

Re: Jackrabbit and web application [SEC=UNCLASSIFIED]

2010-03-29 Thread Ross . Dyson
I am building my application with the rar file (for easy replaceability 
with JCR updates and encapsulation of all the necessary jars) and a web 
application that supplies specific APIs (REST and SOAP), plus some nasty 
JSP pages for drilling down through the repository to make sure things are 
where they are supposed to be, to exercise the API for testing, etc.

My application is much like an archive: stuff goes in and then gets 
viewed, and the only updates are more additions.  My API has only a few 
possible calls.
I have thrown in 100K documents as a starter and it goes great.

Ross


Dhrubo dhrubo.ka...@gmail.com wrote on 30/03/2010 02:10:24 PM:

 From: Dhrubo dhrubo.ka...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 30/03/2010 02:11 PM
 Subject: Re: Jackrabbit and web application
 
 How about embedding the repository in an web app and exposing remoting 
API -
 Rest , SOAP etc for applications who are interested?
 I am sure this has been tried. Comments will be highly appreciated.
 
 
 On Mon, Mar 29, 2010 at 10:37 PM, ChadDavis chadmichaelda...@gmail.com wrote:
 
  On Mon, Mar 29, 2010 at 10:54 AM, Dhrubo dhrubo.ka...@gmail.com 
wrote:
   Chad -
   some examples / source will be appreciated
 
  The jackrabbit site has examples for all of this.  The remoting wiki
  page is the best source for the remoting strategies, and the same
  JVM method is well documented also, again there are several options.
  They all make use of a servlet that is found in the jackrabbit-webapp
  module; this servlet inits the repository and makes it available in
  the servlet container.
 
 
 
  
   I guess embeded one is fastest but is there any data to prove it?
 
  There's no network layer . . . I'm not sure you need data to prove
  that it would be faster.  But if you need specifics, you'll probably
  have to produce them yourself.  I'd be interested to see the numbers
  you come up with.  I'm going to release my own numbers when we get
  tests that would be meaningful to the community.
 
  
   ~ dhrubo
  
   On Mon, Mar 29, 2010 at 10:15 PM, ChadDavis 
chadmichaelda...@gmail.com
  wrote:
  
   On Sun, Mar 28, 2010 at 8:44 AM, Ilya Skorik i...@skorik.me 
wrote:
   
Hello.
   
I plan to use jackrabbit in a Web application. Prompt, what 
strategy
  is
necessary for selecting for the connection organisation to base?
   
  
   There are many options.  Actually, it seems that a lot of people 
are
   running the repository inside the webapp.  My team insists on a 
server
   style deployment of the repository, and I found that the spi/davex
   remoting is the most robust at this time; perhaps it's even 
considered
   the preferred remoting method.
  
   See:
  
   http://wiki.apache.org/jackrabbit/RemoteAccess
  
   Note, there's an RMI method on that same page but it's considered 
to
   lack production performance, and, in Jackrabbit 2.0, it's not even
   fully implemented.
  
  
  
Considering specificity of Web applications when the server 
handles
  set
   of
queries from different users.
   
Whether there is something like connection pooling for 
jackrabbit?
--
View this message in context:
  
  
http://n4.nabble.com/Jackrabbit-and-web-application-tp1694146p1694146.html
Sent from the Jackrabbit - Users mailing list archive at 
Nabble.com.
   
  
  
  
  
   --
   Thanks ... Dhrubo
   My Book - http://www.apress.com/book/view/1430210095
  
   My Blog -
   http://www.jtraining.com/blogs/blogger/dhrubo/
  
   LinkedIn - http://www.linkedin.com/in/dhrubo
  
 
 
 
 
 -- 
 Thanks ... Dhrubo
 My Book - http://www.apress.com/book/view/1430210095
 
 My Blog -
 http://www.jtraining.com/blogs/blogger/dhrubo/
 
 LinkedIn - http://www.linkedin.com/in/dhrubo



Re: NoClassDefFoundError accessing jackrabbit jar from servlet [SEC=UNCLASSIFIED]

2010-03-29 Thread Ross . Dyson
What happens if you put the specific jar file jcr-2.0.jar in the 
WEB-INF/lib directory?




Rob Brown rlb.so...@gmail.com wrote on 30/03/2010 03:13:34 PM:

 From: Rob Brown rlb.so...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 30/03/2010 03:14 PM
 Subject: NoClassDefFoundError accessing jackrabbit jar from servlet
 
 Hello,
 
 My first attempt at sending this failed b/c I attached a zip file. Sorry 
if
 this is a duplicate message for some.
 
 I'm getting the above mentioned error when trying to access a repository
 using jackrabbit-standalone-2.0.0.jar from a servlet. I didn't use the
 jackrabbit war because I already have a thick client app working and I 
want
 to reuse as much code as possible. I just assumed doing this was 
possible.
 
 If I do the exact same thing from a thick client (Swing) window the 
error
 does not occur. The problem seems to be related to class loading from a
 servlet. I have got the same results in Tomcat 5.5 and in the eclipse
 embedded web server.
 
 To test I created a small web application. Since I cannot attach a zip 
file
 I will just copy the doPost() method below:
 
 protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

     Repository repository = new TransientRepository(
             "repository.xml", // embedded within the war
             "path/to/home/dir");
     Session session = null;
     try {
         session = repository.login();
         System.out.println("root node identifier: "
                 + session.getRootNode().getIdentifier());
     } catch (Exception e) {
         e.printStackTrace();
     } finally {
         // login() may have thrown, leaving session null
         if (session != null) {
             session.logout();
         }
     }
 }
 
 When I post to this servlet from an html form the exception has 2 parts:
 java.lang.NoClassDefFoundError: javax/jcr/Repository
 java.lang.ClassNotFoundException: javax.jcr.Repository
 
 I found a link to a similar issue that may relate to what's happening:
 http://www.eclipse.org/forums/index.php?t=treegoto=87658#page_top
 
 The first reply to this query says:
 That can happen e.g. if your class is found but it executes a static
 initialiser (i.e. a public static final assignment) that uses another 
class
 that's not exported by the system bundle.
 
 I'm not using Equinox or anything other than Eclipse for EE developers.
 Perhaps it's related to a bundle issue within jackrabbit.
 
 Is accessing a repository in this way from a servlet (i.e. using the
 jackrabbit jar not war) not a supported function, or just not a good 
idea in
 general? My goals are to keep the code as simple as possible and 
minimize
 the attack surface for troublemakers (i.e. I do not want to expose REST 
or
 any other jackrabbit servlet api to an experienced hacker who might
 recognize what library I am using). I'm handling all user authentication 
in
 my app already and only want my java domain objects to make calls to the
 repository.
 
 Thanks in advance for any assistance you can offer. Comments about the
 security implications of this approach vs. the jackrabbit war are also 
most
 welcome.
 
 Rob



Re: jackrabbit 2.0 binary search indexing [SEC=UNCLASSIFIED]

2010-02-18 Thread Ross . Dyson
My binary files are all PDFs, so the text is extracted with PdfBox toolkit 
and the full text becomes keyword searchable.
All done using the default configuration, except I extended nt:resource to 
add a few attributes.

The mimeType attribute will be application/octet-stream. 
Perhaps there is no plug-in that knows how to extract text from your 
binary files?
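
Per the reply quoted below, the text extractor is picked from jcr:mimeType, so a file stored as application/octet-stream would likely not be full-text indexed even if its content is a PDF. A minimal sketch of choosing the MIME type from the file name before setting that property, assuming a small hand-written extension table. MimeTypeGuess and its entries are hypothetical, not part of Jackrabbit:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: picks a jcr:mimeType value from a file name.
class MimeTypeGuess {

    static final String FALLBACK = "application/octet-stream";

    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("doc", "application/msword");
        BY_EXTENSION.put("txt", "text/plain");
    }

    // Returns the mapped type, or the generic fallback for unknown extensions.
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return FALLBACK;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, FALLBACK);
    }

    public static void main(String[] args) {
        System.out.println(forFileName("report.PDF"));
        System.out.println(forFileName("blob"));
    }
}
```

Logging the value actually stored in jcr:mimeType for a document that does not show up in searches is a quick first check.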




From:   ChadDavis chadmichaelda...@gmail.com
To: users@jackrabbit.apache.org
Date:   19/02/2010 11:13 AM
Subject:Re: jackrabbit 2.0 binary search indexing



On Thu, Feb 18, 2010 at 2:39 PM, Alexander Klimetschek aklim...@day.com 
wrote:
 On Thu, Feb 18, 2010 at 18:35, ChadDavis chadmichaelda...@gmail.com 
wrote:
 I'm looking for information on how to enable binary search indexing.
 I found documentation for pre-2.0 jackrabbit, and reference to the
 fact that Tika is now used internally for the binary indexing.
 However, I can't find any documentation of how to enable the binary
 indexing . . ..

 It is enabled for all nt:file binaries, ie. the jcr:content/jcr:data
 property. The mimetype for text extraction is taken from the
 jcr:content/jcr:mimeType property. I don't know if you can enable it
 for other binary properties.


Just to clarify, you are saying that the binary indexing, as long as
I'm using the JCR built-in node types for my binary file storage, e.g.
nt:file --> jcr:content (nt:resource) --> jcr:data ( binary property
with my file ), occurs automatically?

If so, then something's not working for me.  Can you recommend some
troubleshooting tips?  How can I determine whether the binaries are
being indexed?  Note, I'm doing a full text search and it DOES hit
other node properties, etc.




Re: jackrabbit 2.0 binary search indexing [SEC=UNCLASSIFIED]

2010-02-18 Thread Ross . Dyson
I only have a small dataset in my test application (100 docs); a document 
certainly only takes a few seconds to become available for the keyword search.

ChadDavis chadmichaelda...@gmail.com wrote on 19/02/2010 11:33:27 AM:

 From: ChadDavis chadmichaelda...@gmail.com
 To: users@jackrabbit.apache.org
 Date: 19/02/2010 11:34 AM
 Subject: Re: jackrabbit 2.0 binary search indexing [SEC=UNCLASSIFIED]
 
 On Thu, Feb 18, 2010 at 5:30 PM,  ross.dy...@ipaustralia.gov.au wrote:
  My binary files are all PDFs, so the text is extracted with PdfBox 
toolkit
  and the full text becomes keyword searchable.
  All done using the default configuration, except I extended 
nt:resource to
  add a few attributes.
 
  The mimeType attribute will be application/octet-stream.
  Perhaps there is no plug-in that knows how to extract text from your 
binary
  files?
 
 I tried pdf, word, and a plain text file . . . how long does it take
 for a doc to be indexed?
 
 
 
 
 



Re: repo browser [SEC=UNCLASSIFIED]

2010-02-17 Thread Ross . Dyson
I deployed on WebLogic 10.3.1, but I only get 403 errors when I try to go 
to the default page, and I can't see anything in the logs.

Ross.



Rakesh Vidyadharan rak...@sptci.com wrote on 17/02/2010 05:14:31 AM:

 From: Rakesh Vidyadharan rak...@sptci.com
 To: users@jackrabbit.apache.org
 Date: 17/02/2010 05:15 AM
 Subject: Re: repo browser
 
 
 On 16 Feb 2010, at 11:52, Patricio Echagüe wrote:
 
  Hey Rakesh, thank you so much. It works great.
 
 You are welcome.  I released 2.1 a couple of hours ago, with some 
 additional 2.0 features as well as RMI connection ability.
 
 Rakesh
 
  
  On Mon, Feb 15, 2010 at 1:06 PM, Rakesh Vidyadharan rak...@sptci.com wrote:
  
  
  On 15 Feb 2010, at 14:00, Patricio Echagüe wrote:
  
  Do you guys have any recommendations for Jackrabbit 2.0 (JSR 283) ?
  The explorers that are mentioned in those links throw an exception
  complaining about 2.0 dtd when trying to open up the repository.xml
  
  http://kenai.com/projects/jcrmanager 2.0 is for JR 2.0
  
  Rakesh
  
  
  
  
  -- 
  Patricio.-
 
 Rakesh Vidyadharan
 President & CEO
 Sans Pareil Technologies, Inc.
 http://sptci.com/
 
 
 | 100 W. Chestnut, Suite 1305 | Chicago, IL 60610-3296 USA |
 | Ph: +1 (312) 212 3933 | Mobile: +1 (312) 315-1596 (US), +91  949 
 611 0873 (IN) | Fax: +1 (312) 276-4410 | E-mail: rak...@sptci.com
 
 
 
 



Suitability of jackrabbit for requirements [SEC=UNCLASSIFIED]

2009-12-08 Thread Ross . Dyson
Hi all

I have a requirement to archive several million documents (PDFs) with some 
variable metadata.  The archive process is the only thing that writes


Suitability of jackrabbit for requirements [SEC=UNCLASSIFIED]

2009-12-08 Thread Ross . Dyson
Hi all

I have a requirement to archive several million documents with variable 
metadata (dates, types of document).  There is only the update process 
that adds documents daily, and perhaps some very few updates manually 
(depending on resolution of requirements).  Documents will be retrieved by 
their position in the tree, plus some filtering on the metadata.

I was planning to deploy jackrabbit with a war that implements a 
simplified web service for the update process and the retrieval. The 
searching should not be very heavily used, peaking at a few calls per 
second, max.  I sold it to management with the promise of scaling up to a 
licensed implementation if/when necessary.

Couple million docs, terabyte of data, modest throughput, availability of 
upgrade path, easy backups.  From my reading/lurking, this sounds like a 
job for bundle persistence manager using H2 database.

Have I missed something?  Does this sound like a workable plan?

Thanks for comments

Ross.

