Re: lenya-search proposal

Michael Wechner Sun, 12 Jun 2005 23:02:09 -0700

Robert Goene wrote:

Hi,
Thanks for actually reading it and giving a thorough reply!
I think this could be done more generally, such that a document is indexed
every time it changes, e.g. also after editing, whereas there could
be one index for the authoring area and one index for the live area.
If the document changes, it will be reindexed. I don't really see the need for a separate index for every area.

Within the authoring area the content can be quite different. Also, there can be documents which don't exist within the live area. I think it definitely makes sense to have different indices, or rather to be able to search on "different versions" re workflow status.

I don't think it will be much more work to implement, but rather keep the interface general enough and maybe just implement the live area if time is too limited for you.

But I'd suggest that you rather drop some other features and focus on this.
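
To make the "general interface" point concrete, here is a minimal sketch of an index that is keyed by area, so that a first version could fill only the live area. It is plain Python with an in-memory inverted index and is not Lenya or Lucene code; the class name `AreaIndexes` and its methods are hypothetical.

```python
from collections import defaultdict


class AreaIndexes:
    """Hypothetical per-area search index: one inverted index per area."""

    def __init__(self, areas=("authoring", "live")):
        # area -> term -> set of document ids
        self._postings = {area: defaultdict(set) for area in areas}
        # area -> document id -> set of terms currently indexed for it
        self._terms = {area: {} for area in areas}

    def reindex(self, area, doc_id, text):
        """(Re)index a document in one area, replacing any earlier version."""
        self.remove(area, doc_id)
        terms = set(text.lower().split())
        for term in terms:
            self._postings[area][term].add(doc_id)
        self._terms[area][doc_id] = terms

    def remove(self, area, doc_id):
        """Drop a document from one area; unknown documents are ignored."""
        for term in self._terms[area].pop(doc_id, set()):
            self._postings[area][term].discard(doc_id)

    def search(self, area, term):
        """Return the sorted ids of documents in `area` containing `term`."""
        return sorted(self._postings[area].get(term.lower(), set()))


indexes = AreaIndexes()
indexes.reindex("authoring", "doc1", "Draft of the release note")
indexes.reindex("authoring", "doc2", "Unpublished draft")
indexes.reindex("live", "doc1", "Release note")
```

With this shape, a search for "draft" finds both documents in the authoring area and nothing in the live area, and an implementation limited to the live area would simply never call `reindex` for "authoring".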

I don't think a document should require a schema, but I guess we get into a religious war here. But you can definitely not assume that everything is validated by RelaxNG, because Lenya would close itself off badly if it neglected schemas like XSD and others ...
On the one hand you like the centralized definition of the index, as you propose to add the indexing to the schema, and on the other hand you like to keep the schema requirement as flexible as possible. I see the dilemma, and that's why I think my idea is a nice way to keep some sort of flexibility on the schema side, but with a centralized definition in the form of the sample file.



sorry, I didn't understand that you were talking about a sample file, but one
thing to think about is probably recurring elements and how to handle them.

Changing the fields would require a change of the 'obsolete' XML documents, but I think this is a rare case that should actually be avoided. Fields can be added or fields can become obsolete without a problem, but changing a field is something that is done rarely, if ever. Could you give me a scenario where this would be an urgent problem?



let's say you have one title field, and then one adapts the schema so that it has a maintitle and a subtitle, and title will be gone, whereas the title becomes the maintitle
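
One possible way to soften this scenario, sketched here only as an illustration, is an alias table that maps obsolete field names onto current ones at indexing time, so that older documents need not be rewritten. The names `FIELD_ALIASES` and `normalize_fields` are hypothetical and not part of the proposal.

```python
# Hypothetical alias table: obsolete field name -> current field name.
FIELD_ALIASES = {"title": "maintitle"}


def normalize_fields(fields, aliases=FIELD_ALIASES):
    """Return a copy of `fields` with obsolete names replaced by current ones."""
    normalized = {}
    for name, value in fields.items():
        current = aliases.get(name, name)
        # A value stored under the current name wins over an aliased one.
        if current in normalized and name != current:
            continue
        normalized[current] = value
    return normalized


old_document = {"title": "Lenya 1.4 released", "content": "Release notes"}
new_document = {"maintitle": "Lenya 1.4", "subtitle": "Released"}

old_fields = normalize_fields(old_document)
new_fields = normalize_fields(new_document)
```

Here `old_fields` carries the old title under `maintitle`, while `new_fields` is unchanged. A field that is split or merged rather than renamed would need more than a lookup table.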

Well, this is just a first shot. I will probably change it, but something like this:


<pr>
  <title>
    <lenya:index>title</lenya:index>
    Lenya 1.4 release preponed
  </title>
  <content>
    <lenya:index>contents</lenya:index>
    The release of Lenya 1.4, the Apache Content Management System, ladila
  </content>
</pr>
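
As a rough sketch of how such a sample file could be read, the following Python snippet collects the text of every element that carries a `lenya:index` marker. It assumes a namespace URI for the `lenya` prefix (the sample does not declare one), uses a lightly adapted copy of the sample, and stores a list per field so that recurring elements are kept rather than overwritten. The function name `extract_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed namespace URI; the sample in the mail does not declare one.
LENYA_NS = "http://example.org/lenya-index"
MARKER = "{%s}index" % LENYA_NS

SAMPLE = """<pr xmlns:lenya="http://example.org/lenya-index">
  <title>
    <lenya:index>title</lenya:index>
    Lenya 1.4 release preponed
  </title>
  <content>
    <lenya:index>contents</lenya:index>
    The release of Lenya 1.4, the Apache Content Management System
  </content>
</pr>"""


def extract_fields(xml_text):
    """Map each index field name to the list of texts marked with it."""
    fields = defaultdict(list)
    root = ET.fromstring(xml_text)
    for element in root.iter():
        marker = element.find(MARKER)  # direct children only
        if marker is None or not (marker.text or "").strip():
            continue
        # Collect the element's text, leaving out the marker's own text.
        parts = [element.text or ""]
        for child in element:
            if child.tag != MARKER:
                parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
        text = " ".join(" ".join(parts).split())
        fields[marker.text.strip()].append(text)
    return dict(fields)


fields = extract_fields(SAMPLE)
```

A second `<title>` element with the same marker would simply append a second entry to `fields["title"]`. Attribute values are not covered by this sketch.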



how do you want to mark attributes?

how do you want to treat these external links?
I want to fetch the links in the document parser and let Nutch fetch them when the scheduled index process runs. I am not sure yet whether I can feed them to Nutch directly or whether I should add them to a text file that Nutch uses. I will give it another look.

I was actually rather thinking about how you want to handle them within the index, because they won't have the same fields as the ones within Lenya. Do you want to create a separate index?

As far as I can see, it contains all the output one can ask for from a Lucene query.



also paging?

The nice thing is: it is possible to spread the results over different pages. The links to all pages are delivered with the output. It looks pretty comprehensive to me.
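
For illustration only, the paging behaviour described here (one page of results plus links to all pages) can be sketched in a few lines of Python. This is not the output format of the search component under discussion; the function `paginate` and its result keys are hypothetical.

```python
import math


def paginate(hits, page, page_size=10):
    """Return one page of hits plus the numbers of all available pages."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    page_count = max(1, math.ceil(len(hits) / page_size))
    # Clamp out-of-range requests to the nearest valid page.
    page = min(max(page, 1), page_count)
    start = (page - 1) * page_size
    return {
        "page": page,
        "pages": list(range(1, page_count + 1)),
        "hits": hits[start:start + page_size],
    }


all_hits = ["doc%d" % i for i in range(1, 26)]
result = paginate(all_hits, page=3, page_size=10)
```

With 25 hits and ten per page, the third page holds the last five hits, and `result["pages"]` lists the three page numbers a front end could turn into links.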
Again, thanks for the reply!

no problem, thanks very much for working on it. Please don't be afraid of my comments (in case you are), but I just want to make sure that various things are being considered.

Thanks

Michi


Regards, Robert

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
[EMAIL PROTECTED]                        [EMAIL PROTECTED]

