+1 for the idea of the database-based backend. Seems very useful. One idea for improvement: group the queued work by (entity type, id) and select only the latest LuceneDatabaseWork per group. That way you'd avoid propagating potentially outdated index updates.
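To make the suggestion a bit more concrete, here is a rough sketch of the grouping step in plain Java. I have not checked the actual fields of LuceneDatabaseWork, so `QueuedWork`, `sequence` and `operation` are hypothetical stand-ins, and the sketch assumes that each queued item describes the full index state of its entity, so that older items for the same entity can safely be skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LatestWorkSelector {

    /** Hypothetical, simplified stand-in for one queued LuceneDatabaseWork row. */
    public static final class QueuedWork {
        public final String entityType;
        public final String entityId;
        public final long sequence;    // assumption: grows with insertion order
        public final String operation; // assumption: e.g. "ADD", "UPDATE", "DELETE"

        public QueuedWork(String entityType, String entityId, long sequence, String operation) {
            this.entityType = entityType;
            this.entityId = entityId;
            this.sequence = sequence;
            this.operation = operation;
        }
    }

    /** Keeps only the item with the highest sequence for each (entityType, entityId) pair. */
    public static List<QueuedWork> latestPerEntity(List<QueuedWork> queued) {
        // LinkedHashMap keeps the order in which each entity was first seen.
        Map<List<String>, QueuedWork> latest = new LinkedHashMap<List<String>, QueuedWork>();
        for (QueuedWork work : queued) {
            List<String> key = Arrays.asList(work.entityType, work.entityId);
            QueuedWork seen = latest.get(key);
            if (seen == null || work.sequence > seen.sequence) {
                latest.put(key, work);
            }
        }
        return new ArrayList<QueuedWork>(latest.values());
    }
}
```

The same selection could probably be pushed into the query that loads the queued rows; doing it in memory is just the simplest way to show the idea.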
2015-08-09 17:51 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
> Hi,
> yes, creating issues on JIRA is what we normally do, feel free to create them!
>
> They don't have to be sub-tasks; we use sub-tasks when there are several "blocker" steps which need to be accomplished to get a feature and it's so large that it needs to be split.
> In this case I'd make a single JIRA for the new feature - which gets resolved when we merge it in its most essential form - and further improvement ideas can be created now or as needed as independent improvements.
>
> Thanks!
> Sanne
>
> On 9 August 2015 at 06:56, Flemming Harms <flemming.ha...@gmail.com> wrote:
>>
>> 2015-08-05 23:38 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
>>>
>>> Hi Flemming,
>>> welcome to this list!
>>>
>>> I waited a bit to reply myself, as you already know I like the proposal. Unfortunately many others are on holidays, so other feedback might be slow.
>>>
>>> Still, I wouldn't let that slow you down: do start the work of merging it; I already anticipated over chat that this would come and we all agree that the concept is useful!
>>> I don't think others looked at the details yet, but if it comes to concerns at that level, we can address smaller issues incrementally.
>>> (I also didn't look at micro-details, as it's easier to comment on those on a pull request).
>>>
>>> I had the same question as Martin regarding clustering: with the current implementation you expect something like the master/slave configuration, or Infinispan to be used as storage, correct?
>>> I also think it would be interesting to explore the approach further to also - optionally - serve as a replacement for these, but that's another feature which is easier to experiment with after the core concept is merged.
>>
>> Yes that's correct. To start with, it was written very specifically for the use of Infinispan as directory.
>>
>> And I agree that we should explore other cluster configurations, and I have some ideas about how we can implement them.
>>>
>>> In short, I would simply merge your backend as a new module in Hibernate Search! Fork our repository, and send a pull request.
>>>
>>> # Code layout / Modules
>>>
>>> In terms of code structure, you might have noticed that the module 'hibernate-search-engine' (/engine in the source code) does not depend on JPA nor Hibernate ORM; the reason is that other projects reuse the core indexing strategy and the backends. Since it would be nice to allow them to optionally use your backend, while still not mandating a dependency on ORM for those who don't, I think this should be a new Maven module.
>>>
>>> We already have
>>> /backends/jgroups
>>> /backends/jms
>>>
>>> So we could add (name to be refined?):
>>> /backends/relationaldb
>>
>> sure, no problem :)
>>>
>>> Also, your integration tests probably should be moved together with our other integration tests. They are currently running WildFly 10.0.0.Alpha6, but that shouldn't be a problem.
>>>
>>> # Code Style
>>>
>>> We use tabs ;-)
>>> And also have various other "exotic" conventions regarding white-space usage, right header files, etc.
>>> We use CheckStyle to keep it tidy; it will give you lots of errors, and when there are many it's not very helpful. I would suggest taking the formatting templates attached at the following link and using your IDE's formatting capabilities, resorting to CheckStyle just for the final validation:
>>> - http://hibernate.org/search/contribute/
>>>
>>> # JDK
>>>
>>> It looks like your extension requires Java 8; if you could convert it to Java 7 that would be nice.
>>
>> Don't think it will be an issue.
>> As far as I remember we don't use any Java 8 specific functionality
>>>
>>> # Rebasing to latest
>>>
>>> I'm afraid we're now aiming at Hibernate ORM 5, so some details might need to be updated; probably just in the configuration area. We're also in the process of upgrading to Apache Lucene 5, but that shouldn't affect you at all.
>>>
>>> # Some improvement ideas
>>>
>>> While we should support the case in which Hibernate Search is not being run as an extension of Hibernate ORM, running it with ORM is likely the most common case.
>>> In that scenario I think it would be nice to be able to look up the existing ORM services so that users don't need to repeat, for example, the datasource configuration.
>>>
>>> We might also be able to reuse all of the SessionFactory, but I'm not sure how to include your model without it potentially interfering with the end user's model; I'd say let's start by sharing some services from ORM and then see what kind of improvements we can build into ORM for this use case; for example this might simplify some of the TransactionManager configuration code I'm seeing in your repository.
>>>
>>> Of course your existing configuration properties are useful too, especially for the non-ORM case as we'll need be able to reuse the ORM services.
>>>
>>> Also, you might have noticed we are now able to optionally include the backend operations in the same transaction. That's not the default, as commonly people don't want that, but it would be very interesting to evolve this backend to support that option too; you wouldn't even require XA when storing the entity in the same database!
>>> - http://in.relation.to/2015/07/09/hibernate-search-jms-transaction/
>>>
>> Yes, it's a very nice feature and fits perfectly with relationaldb
>>>
>>> I'd be happy to help with this, feel free to share non-working and/or intermediate experimental branches when having questions or just stuck.
>>> Please start by creating a JIRA; you can leave the target version undefined: we'll merge it when it's ready.
>>>
>> For all the tasks you have listed, can I create sub-tasks on the JIRA, or how do you track tasks?
>>
>>>
>>> Thanks,
>>> Sanne
>>>
>>>
>>> On 5 August 2015 at 20:05, Flemming Harms <flemming.ha...@gmail.com> wrote:
>>> > Hi Martin
>>> >
>>> > For this version the AbstractDatabaseHibernateSearchController is not able to process Lucene workers simultaneously. When we built it, our initial requirement was that only one node should process the workers at a time, but the “master” was floating. We use Quartz to get this type of functionality, and it synchronizes the execution between the nodes. But you could also use an HA-singleton to dedicate a specific node to processing the workers.
>>> >
>>> > We had been playing with an idea where we stamp the LuceneDatabaseWork with the known cluster nodes, and then the last node removes it from the database, or a scheduled job can take care of it. The advantage of this solution is that it makes Infinispan optional, and it can store the indexes on each node instead of in a shared cache.
>>> >
>>> > Your idea and work look very nice. Pretty awesome feature to support different JPA providers.
>>> >
>>> > --
>>> > cheers
>>> > Flemming
>>> >
>>> >
>>> > 2015-08-05 11:57 GMT+02:00 Martin Braun <martinbraun...@aol.com>:
>>> >
>>> >> Hi,
>>> >>
>>> >> Note: I am no core developer of Hibernate Search, but I am currently working on something that looks quite similar to what you are doing :). One part of it is an updating mechanism based on triggers that uses the database as an event storage as well. It's not the exact same thing, but related.
>>> >>
>>> >> https://github.com/Hotware/Hibernate-Search-JPA
>>> >>
>>> >> The idea is quite nice, but after looking at the source code I am wondering how the different nodes are able to work together, because in AbstractDatabaseHibernateSearchController you remove the entity from the persistence context and I wasn't able to find code that would make up for that.
>>> >>
>>> >> Doesn't that mean that the other workers will not be able to read that entity?
>>> >> Or will users of this need to implement their own synchronization mechanism between the different nodes?
>>> >>
>>> >> Martin Braun
>>> >> martinbraun...@aol.com
>>> >> www.github.com/s4ke
>>> >>
>>> >> -----Original Message-----
>>> >> From: Flemming Harms <flemming.ha...@gmail.com>
>>> >> To: Hibernate.org <hibernate-dev@lists.jboss.org>
>>> >> Sent: Tue, Aug 4, 2015 6:40 pm
>>> >> Subject: [hibernate-dev] [Hibernate Search] Database back end worker
>>> >>
>>> >> Hey guys
>>> >>
>>> >> I want to introduce myself and a new database back-end worker that another guy and I have built for Hibernate Search. I already had some initial talks with Sanne about whether this could be of interest to the Hibernate Search project.
>>> >>
>>> >> I have been working with Hibernate Search for some time and have actually done various small custom modifications to Search since 3.x, especially around running in a cluster and indexing. To make a long story short, when we upgraded Hibernate Search we thought it would be ideal to use a SQL database as storage for Lucene workers, for 3 main reasons:
>>> >>
>>> >> - The database was shared between the nodes
>>> >> - The workers were persistent in case of a node crash.
>>> >> - No master/slave
>>> >>
>>> >> In some ways it’s very similar to the JMS back-end worker, where the user also has to implement an MDB that processes the workers. In our case they will have to implement a job using something like Quartz or a timer service.
>>> >>
>>> >> We are using JPA as the persistence layer for the database. Even though it’s a fairly simple entity we persist, it makes sense for supporting various databases and schema updates out of the box. We have tried to make it as easy as possible to set up by minimizing the number of properties, and it’s all configurable from the persistence.xml
>>> >>
>>> >> The actual work can be found here:
>>> >> https://github.com/umbrew/org.umbrew.hibernate.database.worker.backend
>>> >>
>>> >> So based on this introduction and the code, is this something you could use?
>>> >> (of course with the modification it requires to follow the design, style, docs etc for the search)
>>> >>
>>> >> --
>>> >> Kind regards
>>> >> Flemming Harms
>>> >> _______________________________________________
>>> >> hibernate-dev mailing list
>>> >> hibernate-dev@lists.jboss.org
>>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>> >
>>> > --
>>> > Kind regards / Med Venlig Hilsen
>>> > Flemming Harms
>>> >
>>> > https://twitter.com/fnharms
>>> > https://dk.linkedin.com/in/fharms
>>
>> --
>> Kind regards / Med Venlig Hilsen
>> Flemming Harms
>>
>> https://twitter.com/fnharms
>> https://dk.linkedin.com/in/fharms