Re: Help for a new user

Ron Wheeler Sun, 17 Aug 2014 10:56:51 -0700

Lucene is embedded in Jackrabbit.
You should look at Solr (http://lucene.apache.org/solr/)

"Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called "indexing") via XML, JSON, CSV or binary over HTTP. You query it via HTTP GET and receive XML, JSON, CSV or binary results."

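As a rough illustration of the "query it via HTTP GET" part of that description, here is a minimal Java sketch that only builds the query URL. The host, port and the core name "documents" are assumptions for the example, not something from this thread; adjust them to your own Solr installation.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SolrQueryUrl {
    // Builds the URL for a Solr "select" query that asks for JSON results.
    // baseUrl and core are placeholders for your own installation.
    public static String build(String baseUrl, String core, String userQuery) {
        // URL-encode the user's words so spaces and symbols are safe in a query string.
        String q = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return baseUrl + "/" + core + "/select?q=" + q + "&wt=json";
    }

    public static void main(String[] args) {
        // "documents" is a hypothetical core name.
        System.out.println(build("http://localhost:8983/solr", "documents", "annual report"));
        // prints http://localhost:8983/solr/documents/select?q=annual+report&wt=json
    }
}
```

Sending a GET request to that URL from the application, or pasting it into a browser, would return the matching documents if such a core exists and has been indexed.
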
A repository might be helpful in organizing a document library in multiple virtual hierarchies without duplication.

It also gives you a lot of flexibility in access control.

But you may not need one.

Ron

On 17/08/2014 1:27 PM, Julián wrote:

Thanks again for your response Ron.
It seems you're the only one on the mailing list. Perhaps people are on their holidays.
I'm beginning to realize that I was wrong.
Because of your response, I've been looking for information and I've found Apache Lucene and Apache Tika. I have to try both, but it seems that they can work together for extracting and indexing files, and Tika supports lots of formats.
I'm now thinking that I don't actually need to use Jackrabbit for my application. Perhaps I only need those tools to search inside the files I want to store. I think I don't need a repository. I can save the properties of the files in a database, and the files in normal folders. I think it'd be pretty easy for me because I'm used to working with databases, but I've never worked with a repository. In fact, I was going to use the repository for its search capabilities, but I'm realizing that I don't need it.
I'm going to try with Lucene and Tika first.
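
To make the "extract, then index" idea concrete, here is a toy inverted index in plain Java. It is not Lucene's API and not a substitute for it. The class name TinyIndex and the sample file names are made up, and the text is assumed to have been extracted already (for example by Tika).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TinyIndex {
    // Maps each lower-cased word to the names of the files that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index text that has already been extracted from a file.
    public void add(String fileName, String extractedText) {
        // Split on anything that is not a letter or a digit.
        for (String word : extractedText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(fileName);
            }
        }
    }

    // Return the files that contain the given word (empty set if none).
    public Set<String> search(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("report.pdf", "Annual sales report for 2014");
        index.add("notes.doc", "Meeting notes about the sales team");
        System.out.println(index.search("sales"));  // [notes.doc, report.pdf]
        System.out.println(index.search("annual")); // [report.pdf]
    }
}
```

A real search library adds what this sketch leaves out: persistence on disk, multi-word and phrase queries, stemming and relevance ranking.
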

Thanks a lot.

--------------------------------------------------
From: "Ron Wheeler" <[email protected]>
Sent: Saturday, August 16, 2014 8:09 PM
To: <[email protected]>
Subject: Re: Help for a new user
Some ideas that may be helpful.
If you want to search inside Jackrabbit using its internal search engine, you are going to have to extract the text on the way in. I think that this means using the appropriate tool to read the content from the incoming document and creating a document linked to the original that can be searched by Jackrabbit and then used to find the original PDF or DOC or XLS, etc. to present to the user.
This should be possible for most of the common document formats, since there are Apache tools such as POI that let you read DOC and XLS files and extract the content.
http://pdfbox.apache.org
http://poi.apache.org/
http://www.swftools.org/
https://wiki.openoffice.org/wiki/Xml
This can be a reasonably general solution if you add a facility that allows users to manually write a document summary or keyword list for documents in formats that you do not support or that do not contain text that describes their content or usage - CAD drawings, QuickBooks backups, database backups, etc.
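
The fallback described above could be sketched roughly like this in Java. The list of supported extensions and all names here (SearchableText, choose, extension) are hypothetical, and the extracted text is assumed to come from whatever extraction tool is in use.

```java
import java.util.Locale;
import java.util.Set;

public class SearchableText {
    // Hypothetical list of formats the text extractor is assumed to handle.
    static final Set<String> SUPPORTED = Set.of("pdf", "doc", "xls");

    // Returns the lower-cased file extension, or "" if the name has none.
    public static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Prefer extracted text for supported formats; otherwise fall back to
    // the summary or keyword list that the user typed in by hand.
    public static String choose(String fileName, String extractedText, String manualSummary) {
        boolean usable = SUPPORTED.contains(extension(fileName))
                && extractedText != null && !extractedText.isBlank();
        return usable ? extractedText : manualSummary;
    }

    public static void main(String[] args) {
        System.out.println(choose("budget.xls", "Q3 budget figures", "budget"));
        System.out.println(choose("house.dwg", "", "floor plan, CAD drawing"));
    }
}
```

Whichever text is chosen is what gets stored in the searchable document linked to the original file.
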
I hope that this gives you something to think about until a real Jackrabbit expert shows up.
Ron

On 16/08/2014 12:49 PM, Julián wrote:
Hello.
I've been able to use a repository in my JSF application at last. If someone has a similar problem, I can help them.
Now, I would like to insert some files (.doc, .pdf, ...) and search for words inside them, like Google. I suppose that I'll have to use text extractors, and I'll have to configure the repository to index the files.
Does anybody know where I can find some examples?
Can anyone tell me where to look?

Thanks


--------------------------------------------------
From: "Ron Wheeler" <[email protected]>
Sent: Sunday, August 10, 2014 7:09 PM
To: <[email protected]>
Subject: Re: Help for a new user
Did you get the example from http://jackrabbit.apache.org/first-hops.html working?
You probably should get Eclipse working with Maven. That will get rid of some of the headaches.
If you want a fast way to get up and running with Eclipse and Maven, try Eclipse STS. It is an Eclipse that comes out of the box with all the plug-ins that you need to develop Java applications with Maven.
This gets rid of the need to set up software on classpaths manually.
Once you have the first hop demo working, you should be able to make your simple web app.
At least you will have specific log messages to talk about.

Ron


On 10/08/2014 7:54 AM, Julián wrote:
(Sorry for my English.)
I'm very new to the Java, Java EE and web development world and, of course, to the Jackrabbit environment. I'm a student and I'm working on my degree project: an "easy" document management system. I only need users to be able to get their documents and to search for groups of words inside them (PDF, DOC, XLS ...) like a Google search. I've heard about Jackrabbit's benefits, so I've decided to use it. (I suppose Jackrabbit can do those tasks?)
I am developing an "easy" JSF application with PrimeFaces, MySQL... and now I'm at the phase where I have to manage the documents. I've read the JSR 283 specification, and I understand it more or less. My problem is how to begin.
I need someone to show me a simple example of how to create and access a repository. The repository only has to work with my application in a Tomcat server. I've been looking for information on the Internet and I'm absolutely lost. Everyone says different things. I haven't been able to find an "easy" example of what I need. On Jackrabbit's website, I've been reading about deployment models, stand-alone server, Jackrabbit Web application, Jackrabbit JCA Resource Adapter ... Oh my god! Is what I want to do really so difficult? I don't think so, perhaps I'm getting older...
I only need:
1. When a client accesses the application for the first time, the repository will be created in a specified path.
2. Clients will upload files, search for content, and download them.

I'm now at the first point. Can anyone help me?
I use the Eclipse IDE and I don't use Maven.
What "jars" must I include in my classpath?
What Java instructions do I need to create and set up the repository? In the JSR specification, they use the RepositoryFactory class. Is that the way to do it?
Thanks a lot, and sorry for my ignorance.


--
Ron Wheeler
President
Artifact Software Inc
email: [email protected]
skype: ronaldmwheeler
phone: 866-970-2435, ext 102