Re: What is an optimal approach?

Mindaugas Žakšauskas Mon, 30 Mar 2009 08:29:44 -0700

As someone who earns a living writing a CMS integrated with Lucene, I
can tell you this is not that simple. You can of course index your
data, but be aware that every subsequent content repository (CR)
operation has to be kept in sync with the index. Say a piece of
content is deleted from the CR: you probably don't want your search to
yield deleted content, so you need to update your index to exclude
it. The same applies to all CRUD operations. What if you want a
clustered solution? What about atomicity? The list goes on...
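
A minimal sketch of the sync problem, under stated assumptions: the
class SearchIndexMirror and its onSaved/onDeleted hooks are
hypothetical, and a plain in-memory map stands in for the real index.
With Lucene the same two hooks would typically map to
IndexWriter.updateDocument and IndexWriter.deleteDocuments, keyed by a
Term on a unique id field. Clustering and atomicity are not addressed
here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a search index keyed by a unique content id. */
class SearchIndexMirror {
    // id -> indexed text; a real index would hold analyzed fields instead.
    private final Map<String, String> indexedText = new HashMap<>();

    /** Create and update are handled alike: replace whatever is stored under the id. */
    void onSaved(String id, String text) {
        indexedText.put(id, text);
    }

    /** A delete in the content repository must also remove the entry from the index. */
    void onDeleted(String id) {
        indexedText.remove(id);
    }

    /** Naive substring match; returns the ids of matching content, sorted. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : indexedText.entrySet()) {
            if (entry.getValue().contains(term)) {
                hits.add(entry.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }
}
```

The point of the sketch is that every write path in the repository
has to call one of these hooks; any path that skips them leaves stale
or missing search results.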

I can only second Mark: make sure you have exhausted all the search
possibilities your current system has to offer.

To answer your question: I know nothing about the MarkLogic API, but
if all your data is in XML, you can always parse it, select the nodes
you want indexed, and create an org.apache.lucene.document.Document
from them. At least that's what we do.
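
A minimal sketch of the parse-and-select step, using only the JDK's
XML and XPath classes. The class name XmlFieldExtractor, the field
names, and the XPath expressions are hypothetical, and the returned
map merely stands in for a Lucene Document: with Lucene on the
classpath, each entry would be added to the Document as a field (the
exact Field constructor depends on the Lucene version).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XmlFieldExtractor {

    /**
     * Parses one XML document and evaluates each XPath expression against it.
     * Returns field name -> extracted text, in the order the fields were given.
     * Fields whose expression selects nothing (or only whitespace) are skipped.
     */
    static Map<String, String> extract(String xml, Map<String, String> fieldToXPath)
            throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        org.w3c.dom.Document dom = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fieldToXPath.entrySet()) {
            // evaluate(String, Object) returns the string value of the first match,
            // or an empty string when nothing matches.
            String value = xpath.evaluate(entry.getValue(), dom).trim();
            if (!value.isEmpty()) {
                fields.put(entry.getKey(), value);
            }
        }
        return fields;
    }
}
```

For example, mapping "id" to /article/@id and "title" to
/article/title on an <article> document yields one entry per field
that is present; the id entry is also what delete and update
operations against the index would be keyed on.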

Regards,
Mindaugas

On Mon, Mar 30, 2009 at 3:46 PM, Shah, Yagnesh <ys...@hwwilson.com> wrote:
>
> Hello Lucene users,
>  We have all our xml documents stored in a content management system from 
> MarkLogic. Is there any best approach to index these documents via lucene?
>

