> "Any Default"
> Good question, I don't think we should index all properties by default:
> I guess we should ask that to the Lucene community.

I think no-index is an appropriate default; but maybe an @IndexAll would be
relevant? (not sure though)

> As for the bridges
> (ie types), they are defaulted, there is a heuristic guess mechanism.

Sounds good.

> The default is FSDirectory with the base directory being ".".
> RAMDirectory is of little use except for some specific use cases and for
> unit testing

Makes sense.

> I'm not a big fan of exposing the Lucene result itself but the relevance
> is something useful, I need to think about that: the main problem is
> that I currently hide some of the plumbing from the user, esp the
> searcher opening and closing; by doing so, there is no way to give the
> Hits (Lucene results).
> The ordering is preserved when returned by Hibernate.

How about turning this upside-down and letting the user execute the query,
so that he has access to the Lucene API query and could do something like:
session.createLuceneQuery(luceneapiquery q).list() ?

> "why session.index() a specific operation?"
> Here is my reasoning:
> - using a lucene query to index a non-indexed object is going to be hard
> since the lucene query will not return the object in the first place ;-)

details ;)

> - using a regular Hibernate query + a flag to index the objects suffers
> the OOME issue unless we use the stateless session. If I use the
> stateless session, I can't use the event system...

Why does this give OOME? If you query for one object + flag it should be
just as heavy/light as index(), right?

> - From what I've seen and guessed, what you want to (re)index is very
> business specific and can be way more complex than just a query

mkay.

> "Session delegation and callbacks"
> Yes but Event Listeners are the current way to have a callback to the
> session. Event Listeners are stateless, the state being part of the
> events.
> What we need is a way to push / keep some information at the event /
> PersistenceContext level.
> The SessionDelegate would be another way to
> keep some state but makes it hard to push the info to the event listeners

Yes, that would allow you to do the #3 I talked about - having a
SessionDelegate that the underlying Session could call back to with enough
context to allow you to maintain it.

/max

> Max Rydahl Andersen wrote:
>>
>> Hi Emmanuel,
>>
>> Here are my comments (sorry if something is obvious from looking at the
>> code, but haven't had time to look into the details yet)
>>
>> > *Concepts*
>> > Each time you change an object state, the lucene index is updated and
>> > kept in sync. This is done through the Hibernate event system.
>>
>> Ok - sounds cool. The index is updated at flush or commit time? (I
>> assume commit)
>>
>> > Whether an entity is indexed or not and whether a property is indexed
>> > or not is defined through annotations.
>>
>> Any defaults?
>>
>> > You can also search through your domain model using Lucene and
>> > retrieve managed objects. The whole idea here is to do a nice
>> > integration between the search engine and the ORM without losing the
>> > search engine power, hence most of the API remains. To sum up, query
>> > Lucene, get managed objects back.
>>
>> Cool.
>>
>> > *Mapping*
>> > A given entity is mapped to an index. A lucene index is stored in a
>> > Directory, a Directory is a Lucene abstract concept for index storage
>> > system. It can be a memory directory (RAMDirectory), a file system
>> > directory (FSDirectory) or any other kind of backend. Hibernate Lucene
>> > introduces the notion of DirectoryProvider that you can configure and
>> > define on a per entity basis (and which is defaulted defaulted). The
>> > concept is very similar to ConnectionProvider.
>>
>> defaulted defaulted ? (defaulted to RAMDirectory maybe ?)
>>
>> > Lucene only works with Strings, so you can define a @FieldBridge which
>> > transforms a java property into a Lucene Field (and potentially
>> > vice-versa). A simpler (more useful?)
>> > version handles the transformation
>> > of a java property into a String.
>> > Some built-in FieldBridges exist. @FieldBridge is very much like a
>> > Hibernate Type. Esp I introduced the notion of precision in dates
>> > (year, month, .. second, millisecond). This FieldBridge and
>> > StringBridge give a lot of flexibility in the way to design the
>> > property indexing.
>>
>> Sounds like a good thing.
>>
>> > *Querying*
>> > I've introduced the notion of LuceneSession which implements Session
>> > and actually delegates to a regular Hibernate Session. This lucene
>> > session has a /createLuceneQuery()/ method and a /index()/ method.
>> >
>> > /session.createLuceneQuery(lucene.Query, Class[])/ takes a Lucene
>> > query as a parameter and the list of targeted entities. Using a Lucene
>> > query as a parameter gives the full Lucene flexibility (no abstraction
>> > on top of it). An /org.hibernate.Query/ object is returned.
>> > You can (must) use pagination. A Lucene query also returns the number
>> > of matching results (regardless of the pagination): query.resultSize(),
>> > sort of count(*).
>>
>> Is there any way to get to the underlying lucene result?
>> As far as I remember Lucene also has some notion of result relevance
>> and ordering which could be relevant to reach?
>>
>> > Having the dynamic fetch profile would definitely be a killer pair
>> > (searching the lucene index, and fetching the appropriate object
>> > graph)
>>
>> +1000 ;)
>>
>> > /session.index(Object)/ is currently not implemented; it requires
>> > some modifications of SessionImpl or of LuceneSession. This feature is
>> > useful to initialize / refresh the index in a batch way (ie loading
>> > the data and applying the indexing process on this set of data).
>> > Basically the object is added to the index queue. At flush() time, the
>> > queue is processed.
>>
>> hmm...why is this specific operation needed if it is done automatically
>> on object changes ?
>>
>> And if it is something you want to allow users to index not-yet-indexed
>> objects, couldn't it be a flag or something on the LuceneQuery?
>>
>> e.g. s.createLuceneQuery("from X as x where x....").setIndex(true) or
>> maybe .setIndex(IndexMode.ONLY_NEW);
>>
>> > design considerations:
>> > The delegation vs subclassing strategy for LuceneSession (ie
>> > LuceneSession delegating to a regular Session allowing simple wrapping
>> > or the LuceneSessionImpl being a subclass of SessionImpl) is an ongoing
>> > discussion.
>>
>> > Using a subclassing model would allow the LuceneSession to keep
>> > operation queues (for batch indexing either through object changes or
>> > through session.index() ), but it does not allow a potential
>> > Hibernate - XXX integration on the same subclassing model. Batching is
>> > essential in Lucene for performance reasons.
>> > Using the delegation model requires some SessionImpl modifications to
>> > be able to keep track of a generic context. This context will keep the
>> > operation queues.
>> >
>> >
>> > *ToDo*
>> > Argue on the LuceneSession design and pick one (Steve/Emmanuel/Feel
>> > free to join the dance)
>>
>> I vote for an impl that will allow an existing Session to be the basis of
>> extension; thus not having the Lucene integration be a hardcoded
>> subclass....we did the same for Configuration and that is
>> smelly/inflexible.
>>
>> We should open enough of the session up to allow such delegation.
>>
>> This might be extremely hard and close to impossible, but that is what I
>> wish for ;)
>>
>> > Find a way to keep the DocumentBuilder (sort of EntityPersister) at
>> > the SessionFactory level rather than the EventListener level
>> > (Steve/Emmanuel)
>>
>> Finding a way of storing structured info/data relative to some of the
>> core concepts in SF would be useful for other things than Lucene
>> integration (E.g.
>> other search, query, tooling impls etc)
>>
>> > Batch changes: to do that I need to be able to keep a session related
>> > queue of all insert/update changes. I can't in the current design
>> > because SessionImpl does not have such a concept and because the
>> > LuceneSession is built on the delegation model. We need to discuss the
>> > strategy here (delegation vs subclassing)
>>
>> Aren't there three strategies?
>>
>> The current one is LuceneSession delegates to Session
>>
>> The other one is LuceneSession extends Session
>>
>> The one I see as third is that LuceneSession delegates to Session, but
>> on that Session we install callbacks so the LuceneSession (and friends)
>> can maintain/participate in some of these state handling scenarios?
>>
>> (maybe that is what you mean by delegation, but just wanted to be sure)
>>
>> > Massive batch changes: in some systems, we don't really bother with
>> > "real time" index synchronization, for those a common batch changes
>> > queue (ie across several sessions) would make sense with a queue
>> > flushing happening every hour for example.
>>
>> Isn't that something related/similar to having a very non-strict cache
>> with a large timeout?
>>
>> > Clustered Directory: think about that. A JDBC Directory might not be
>> > the perfect solution actually.
>>
>> Doesn't Lucene or some sister project provide clustering for Lucene yet?
>>
>> > implements additional strategies to load object on query.list()
>>
>> what is this one?
>>
>> --
>> --
>> Max Rydahl Andersen
>> callto://max.rydahl.andersen
>>
>> Hibernate
>> [EMAIL PROTECTED]
>> http://hibernate.org
>>
>> JBoss Inc
>> [EMAIL PROTECTED]
>>

--
--
Max Rydahl Andersen
callto://max.rydahl.andersen

Hibernate
[EMAIL PROTECTED]
http://hibernate.org

JBoss Inc
[EMAIL PROTECTED]
_______________________________________________
hibernate-devel mailing list
hibernate-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/hibernate-devel
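
For readers following the delegation discussion, the third strategy from the thread (LuceneSession delegates to Session, and callbacks installed on that Session let the wrapper keep and drain its own indexing queue at flush time) might look roughly like the sketch below. Every name in it (CoreSession, FlushListener, SearchSession, RecordingIndexer) is hypothetical and invented for illustration; none of them is a Hibernate or Lucene class, and the real SessionImpl offers no such hook in the design described above.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationSketch {

    /** Hypothetical callback the core session invokes when it flushes. */
    interface FlushListener {
        void onFlush();
    }

    /** Hypothetical stand-in for the regular session; not the Hibernate API. */
    static class CoreSession {
        private final List<FlushListener> listeners = new ArrayList<>();
        private final List<Object> saved = new ArrayList<>();

        void addFlushListener(FlushListener listener) {
            listeners.add(listener);
        }

        void save(Object entity) {
            saved.add(entity);
        }

        void flush() {
            // A real session would write pending changes here, then notify.
            for (FlushListener listener : listeners) {
                listener.onFlush();
            }
        }
    }

    /** Hypothetical sink standing in for whatever writes to the index. */
    static class RecordingIndexer {
        final List<List<Object>> batches = new ArrayList<>();

        void writeBatch(List<Object> batch) {
            batches.add(new ArrayList<>(batch)); // copy, the caller clears its queue
        }
    }

    /** Delegating session that owns the queue and drains it on flush. */
    static class SearchSession {
        private final CoreSession delegate;
        private final RecordingIndexer indexer;
        private final List<Object> queue = new ArrayList<>();

        SearchSession(CoreSession delegate, RecordingIndexer indexer) {
            this.delegate = delegate;
            this.indexer = indexer;
            // The callback is what lets state live in the wrapper.
            delegate.addFlushListener(this::processQueue);
        }

        void save(Object entity) {
            delegate.save(entity);
            queue.add(entity); // changed objects are queued for indexing
        }

        void index(Object entity) {
            queue.add(entity); // explicit (re)indexing, no state change needed
        }

        void flush() {
            delegate.flush();
        }

        private void processQueue() {
            if (queue.isEmpty()) {
                return;
            }
            indexer.writeBatch(queue); // one batch per flush
            queue.clear();
        }
    }

    public static void main(String[] args) {
        CoreSession core = new CoreSession();
        RecordingIndexer indexer = new RecordingIndexer();
        SearchSession session = new SearchSession(core, indexer);

        session.save("entity-1");
        session.index("entity-2");
        session.flush();

        System.out.println("batches written: " + indexer.batches.size());
    }
}
```

The point of the sketch is only that the queue lives in the wrapper while the flush signal comes from the wrapped session, so flushing the core session directly still drains the queue. How such a hook would actually be exposed is the open question in the thread.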