> "Any Default"
> Good question, I don't think we should index all properties by default.
> I guess we should ask that to the Lucene community.

I think no-index is an appropriate default, but maybe an @IndexAll would be
relevant?
(not sure though)
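
To make the default question concrete, here is a rough sketch of how a
no-index default with a class-level opt-in could behave. The annotation names
(@IndexedField, @IndexAll) and the IndexedProperties helper are made up for
illustration; they are not the actual Hibernate Lucene annotations:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical field-level marker: index this property.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface IndexedField {}

// Hypothetical class-level marker: index every property of the class.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface IndexAll {}

final class IndexedProperties {
    private IndexedProperties() {}

    // Returns the names of the instance fields that would be indexed for the given class.
    static List<String> of(Class<?> type) {
        boolean all = type.isAnnotationPresent(IndexAll.class);
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue; // only instance state is a candidate for indexing
            }
            if (all || field.isAnnotationPresent(IndexedField.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }
}

// No class-level annotation: only explicitly marked fields are indexed.
class Book {
    @IndexedField String title;
    String internalNote;
}

// Class-level opt-in: every instance field is indexed.
@IndexAll
class Article {
    String headline;
    String body;
}
```

With this, IndexedProperties.of(Book.class) contains only "title", while
IndexedProperties.of(Article.class) contains both "headline" and "body".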

> As for the bridges
> (i.e. types), they are defaulted, there is a heuristic guess mechanism.

sounds good.
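
As an illustration of what such a type-based guess might look like, here is a
minimal sketch. StringBridge is the name used in the original mail, but the
method signature, the DateBridge class, the BridgeResolver class and the
"first assignable type wins" rule are all assumptions made for this example:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TimeZone;

// Assumed contract: turn a property value into the String stored in the index.
interface StringBridge {
    String objectToString(Object value);
}

// Date bridge with a configurable precision; a coarser precision gives shorter terms.
final class DateBridge implements StringBridge {
    enum Precision {
        YEAR("yyyy"), MONTH("yyyyMM"), SECOND("yyyyMMddHHmmss"), MILLISECOND("yyyyMMddHHmmssSSS");

        final String pattern;

        Precision(String pattern) { this.pattern = pattern; }
    }

    private final Precision precision;

    DateBridge(Precision precision) { this.precision = precision; }

    @Override
    public String objectToString(Object value) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(precision.pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format((Date) value);
    }
}

final class BridgeResolver {
    // Insertion order matters: the first matching entry is used.
    private final Map<Class<?>, StringBridge> defaults = new LinkedHashMap<>();

    BridgeResolver() {
        defaults.put(String.class, value -> (String) value);
        // Plain toString(); note this does not preserve numeric order in the index.
        defaults.put(Number.class, value -> value.toString());
        defaults.put(Date.class, new DateBridge(DateBridge.Precision.MILLISECOND));
    }

    // Heuristic: pick the first registered bridge whose type is assignable from the property type.
    StringBridge resolve(Class<?> propertyType) {
        for (Map.Entry<Class<?>, StringBridge> entry : defaults.entrySet()) {
            if (entry.getKey().isAssignableFrom(propertyType)) {
                return entry.getValue();
            }
        }
        throw new IllegalArgumentException("No default bridge for " + propertyType.getName());
    }
}
```

An explicit bridge on a property would simply bypass resolve(); the resolver
only covers the "nothing was declared" case.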

> The default is FSDirectory with the base directory being ".".
> RAMDirectory is of little use except for some specific use cases and for
> unit testing

Makes sense.
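
A small sketch of how a file-system default with "." as the base directory
could resolve a location per index. This uses only the JDK and returns a Path
instead of a Lucene Directory; the interface name, the "indexBase" property
and the one-subdirectory-per-index layout are assumptions for illustration:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical stand-in for a DirectoryProvider: it only resolves where an index lives.
interface IndexLocationProvider {
    Path locate(String indexName);
}

final class FileSystemLocationProvider implements IndexLocationProvider {
    private final Path base;

    // "indexBase" is an illustrative property name; when absent, the base directory is ".".
    FileSystemLocationProvider(Map<String, String> properties) {
        this.base = Paths.get(properties.getOrDefault("indexBase", "."));
    }

    @Override
    public Path locate(String indexName) {
        // Assumption: one subdirectory per index, named after the index.
        return base.resolve(indexName).normalize();
    }
}
```

A RAM-backed provider for unit tests would implement the same interface and
ignore the base directory entirely.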

> I'm not a big fan of exposing the Lucene result itself, but the relevance
> is something useful. I need to think about that: the main problem is
> that I currently hide some of the plumbing from the user, esp. the
> searcher opening and closing. By doing so, there is no way to give out
> the Hits (Lucene results).
> The ordering is preserved when returned by Hibernate.

How about turning this upside-down and letting the user execute the query,
so that he has access to the Lucene API query and could do something like:

session.createLuceneQuery(luceneApiQuery).list() ?
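
One possible middle ground, sketched with plain JDK types only: keep the
searcher open/close hidden, but copy the ids and scores out of the hits before
closing, so relevance and ordering survive. Every type below (ScoredId,
ScoredResult, IndexSearcherHandle, ScoredQuery) is a hypothetical stand-in and
not part of Lucene or Hibernate:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// One index hit: the entity identifier plus the relevance score reported by the search engine.
final class ScoredId {
    final long id;
    final float score;

    ScoredId(long id, float score) { this.id = id; this.score = score; }
}

// A loaded entity paired with the score of the hit that produced it.
final class ScoredResult<T> {
    final T entity;
    final float score;

    ScoredResult(T entity, float score) { this.entity = entity; this.score = score; }
}

// Stand-in for the searcher; it has to be closed once the hits have been read.
interface IndexSearcherHandle extends AutoCloseable {
    List<ScoredId> search(String query); // hits, already ordered by relevance

    @Override
    void close();
}

final class ScoredQuery {
    private ScoredQuery() {}

    static <T> List<ScoredResult<T>> list(IndexSearcherHandle searcher, String query, Function<Long, T> loader) {
        List<ScoredId> hits;
        try (IndexSearcherHandle open = searcher) {
            // Copy ids and scores while the searcher is still open.
            hits = new ArrayList<>(open.search(query));
        }
        List<ScoredResult<T>> results = new ArrayList<>();
        for (ScoredId hit : hits) {
            T entity = loader.apply(hit.id);
            if (entity != null) { // skip index entries whose entity no longer exists
                results.add(new ScoredResult<>(entity, hit.score));
            }
        }
        return results; // same order as the hits, so relevance ordering is preserved
    }
}
```

The loader would be something like a session lookup by id; the caller never
sees the searcher, yet still gets a score per managed object.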

> "why session.index() a specific operation?"
> Here is my reasoning:
>  - using a Lucene query to index a non-indexed object is going to be hard
> since the Lucene query will not return the object in the first place ;-)

details ;)

>  - using a regular Hibernate query + a flag to index the objects suffers
> the OOME issue unless we use the stateless session. If I use the
> stateless session, I can't use the event system...

Why does this give an OOME? If you query for one object + flag, it should be
just as heavy/light as index(), right?

>  - From what I've seen and guessed, what you want to (re)index is very
> business-specific and can be way more complex than just a query

mkay.

> "Session delegation and callbacks"
> Yes, but Event Listeners are the current way to have a callback to the
> session. Event Listeners are stateless, the state being part of the
> events.
> What we need is a way to push / keep some information at the event /
> PersistenceContext level. The SessionDelegate would be another way to
> keep some state, but it makes it hard to push the info to the event listeners

Yes, that would allow you to do the #3 I talked about: having a
SessionDelegate that the underlying Session could call back to with enough
context to allow you to maintain it.
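
Roughly, the callback variant could look like the sketch below: the delegate
owns the per-session indexing queue, the underlying session only notifies it,
and the queue is drained in one batch at flush time. CoreSession,
SessionCallback and IndexingSessionDelegate are invented names for
illustration and do not correspond to the Hibernate Session API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback contract the underlying session would invoke.
interface SessionCallback {
    void onEntityChanged(Object entity);

    void onFlush();
}

// Minimal stand-in for a session that accepts callbacks.
final class CoreSession {
    private final List<SessionCallback> callbacks = new ArrayList<>();

    void register(SessionCallback callback) { callbacks.add(callback); }

    void saveOrUpdate(Object entity) {
        for (SessionCallback callback : callbacks) {
            callback.onEntityChanged(entity);
        }
    }

    void flush() {
        for (SessionCallback callback : callbacks) {
            callback.onFlush();
        }
    }
}

// Delegate that keeps the indexing queue and writes it out as one batch per flush.
final class IndexingSessionDelegate implements SessionCallback {
    private final CoreSession session;
    private final List<Object> queue = new ArrayList<>();
    // Stands in for the index writer: each element is one batch that was written.
    private final List<List<Object>> writtenBatches = new ArrayList<>();

    IndexingSessionDelegate(CoreSession session) {
        this.session = session;
        session.register(this);
    }

    void saveOrUpdate(Object entity) { session.saveOrUpdate(entity); }

    // Explicit (re)indexing request: queued like any other change.
    void index(Object entity) { queue.add(entity); }

    void flush() { session.flush(); }

    @Override
    public void onEntityChanged(Object entity) { queue.add(entity); }

    @Override
    public void onFlush() {
        if (queue.isEmpty()) {
            return; // nothing to write, so no empty batch is recorded
        }
        writtenBatches.add(new ArrayList<>(queue));
        queue.clear();
    }

    List<List<Object>> writtenBatches() { return writtenBatches; }
}
```

The point of the design is that the state lives in the delegate rather than in
SessionImpl or in the listeners, so another integration could register its own
callback on the same session.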

/max

> Max Rydahl Andersen wrote:
>>
>> Hi Emmanuel,
>>
>> Here are my comments (sorry if something is obvious from looking at the
>> code, but I haven't had time to look into the details yet)
>>
>> > *Concepts*
>> > Each time you change an object state, the lucene index is updated and
>> > kept in sync. This is done through the Hibernate event system.
>>
>> Ok - sounds cool. The index is updated at flush or commit time? (I
>> assume commit)
>>
>> > Whether an entity is indexed or not and whether a property is indexed
>> > or not is defined through annotations.
>>
>> Any defaults?
>>
>> > You can also search through your domain model using Lucene and
>> > retrieve managed objects. The whole idea here is to do a nice
>> > integration between the search engine and the ORM without losing the
>> > search engine power, hence most of the API remains. To sum up, query
>> > Lucene, get managed objects back.
>>
>> Cool.
>>
>> > *Mapping*
>> > A given entity is mapped to an index. A Lucene index is stored in a
>> > Directory; a Directory is a Lucene abstract concept for an index storage
>> > system. It can be a memory directory (RAMDirectory), a file system
>> > directory (FSDirectory) or any other kind of backend. Hibernate Lucene
>> > introduces the notion of DirectoryProvider that you can configure and
>> > define on a per entity basis (and which is defaulted defaulted). The
>> > concept is very similar to ConnectionProvider.
>>
>> defaulted defaulted ? (defaulted to RAMDirectory maybe ?)
>>
>> > Lucene only works with Strings, so you can define a @FieldBridge which
>> > transforms a Java property into a Lucene Field (and potentially
>> > vice-versa). A simpler (more useful?) version handles the transformation
>> > of a Java property into a String.
>> > Some built-in FieldBridges exist. @FieldBridge is very much like a
>> > Hibernate Type. Esp. I introduced the notion of precision in dates
>> > (year, month, .. second, millisecond). FieldBridge and StringBridge
>> > give a lot of flexibility in the way to design the property indexing.
>>
>> Sounds like a good thing.
>>
>> > *Querying*
>> > I've introduced the notion of LuceneSession, which implements Session
>> > and actually delegates to a regular Hibernate Session. This Lucene session
>> > has a /createLuceneQuery()/ method and an /index()/ method.
>> >
>> > /session.createLuceneQuery(lucene.Query, Class[])/ takes a Lucene
>> > query as a parameter and the list of targeted entities. Using a Lucene
>> > query as a parameter gives the full Lucene flexibility (no abstraction
>> > on top of it). An /org.hibernate.Query/ object is returned.
>> > You can (must) use pagination. A Lucene query also returns the number
>> > of matching results (regardless of the pagination): query.resultSize(),
>> > sort of a count(*).
>>
>> Is there any way to get to the underlying Lucene result?
>> As far as I remember, Lucene also has some notion of result relevance
>> and ordering, which could be relevant to reach?
>>
>> > Having the dynamic fetch profile would definitely be a killer pair
>> > (searching the Lucene index, and fetching the appropriate object
>> > graph)
>>
>> +1000 ;)
>>
>> > /session.index(Object)/ is currently not implemented; it requires
>> > some modifications of SessionImpl or of LuceneSession. This feature is
>> > useful to initialize / refresh the index in a batch way (i.e. loading the
>> > data and applying the indexing process on this set of data).
>> > Basically the object is added to the index queue. At flush() time, the
>> > queue is processed.
>>
>> hmm...why is this specific operation needed if it is done automatically
>> on object changes?
>>
>> And if it is something you want in order to allow users to index
>> not-yet-indexed objects, couldn't it be a flag or something on the
>> LuceneQuery?
>>
>> e.g. s.createLuceneQuery("from X as x where x....").setIndex(true) or
>> maybe .setIndex(IndexMode.ONLY_NEW);
>>
>> > design considerations:
>> > The delegation vs subclassing strategy for LuceneSession (i.e.
>> > LuceneSession delegating to a regular Session, allowing simple wrapping,
>> > or LuceneSessionImpl being a subclass of SessionImpl) is an ongoing
>> > discussion.
>>
>> > Using a subclassing model would allow the LuceneSession to keep
>> > operation queues (for batch indexing either through object changes or
>> > through session.index()), but it does not allow a potential Hibernate -
>> > XXX integration on the same subclassing model. Batching is essential
>> > in Lucene for performance reasons.
>> > Using the delegation model requires some SessionImpl modifications to
>> > be able to keep track of a generic context. This context will keep the
>> > operation queues.
>> >
>> >
>> > *ToDo*
>> > Argue on the LuceneSession design and pick one (Steve/Emmanuel/feel
>> > free to join the dance)
>>
>> I vote for an impl that will allow an existing Session to be the basis of
>> extension, thus not having the Lucene integration be a hardcoded
>> subclass....we did the same for Configuration and that is
>> smelly/inflexible.
>>
>> We should open enough of the session up to allow such delegation.
>>
>> This might be extremely hard and close to impossible, but that is what I
>> wish for ;)
>>
>> > Find a way to keep the DocumentBuilder (sort of EntityPersister) at
>> > the SessionFactory level rather than the EventListener level
>> > (Steve/Emmanuel)
>>
>> Finding a way of storing structured info/data relative to some of the
>> core concepts in SF would be useful for other things than Lucene
>> integration (e.g. other search, query, tooling impls etc.)
>>
>> > Batch changes: to do that I need to be able to keep a session related
>> > queue of all insert/update changes. I can't in the current design
>> > because SessionImpl does not have such a concept and because the
>> > LuceneSession is built on the delegation model. We need to discuss the
>> > strategy here (delegation vs subclassing)
>>
>> Aren't there three strategies?
>>
>> The current one is LuceneSession delegates to Session
>>
>> The other one is LuceneSession extends Session
>>
>> The one I see as third is that LuceneSession delegates to Session, but
>> on that Session we install callbacks so the LuceneSession (and friends)
>> can maintain/participate in some of these state handling scenarios?
>>
>> (maybe that is what you mean by delegation, but just wanted to be sure)
>>
>> > Massive batch changes: in some systems, we don't really bother with
>> > "real time" index synchronization; for those, a common batch changes
>> > queue (i.e. across several sessions) would make sense, with a queue
>> > flush happening every hour for example.
>>
>> Isn't that something related/similar to having a very non-strict cache
>> with a large timeout ?
>>
>> > Clustered Directory: think about that. A JDBC Directory might not be
>> > the perfect solution actually.
>>
>> Doesn't Lucene or some sister project provide clustering for Lucene
>> yet?
>>
>> > Implement additional strategies to load objects on query.list()
>>
>> what is this one ?
>>
>> --
>> --
>> Max Rydahl Andersen
>> callto://max.rydahl.andersen
>>
>> Hibernate
>> [EMAIL PROTECTED]
>> http://hibernate.org
>>
>> JBoss Inc
>> [EMAIL PROTECTED]
>>





_______________________________________________
hibernate-devel mailing list
hibernate-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/hibernate-devel
