Re: [hibernate-dev] Re: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
On Tue, 22 Apr 2008 18:47:14 +0200, Sanne Grinovero [EMAIL PROTECTED] wrote:

> Yes right but I don't see how this is worse than searching for foo:bar OR foo:NULL-KEYWORD, just some less ambiguity. If you just want to search for null fields, fooIsNull:true

It is not much worse, but it is less intuitive. One has to know that adding @IndexNullMarker also adds another field, fooIsNull. On the other hand, it would actually work without ambiguities. I guess one would have to RTFM.

> Interesting, I wasn't expecting the index to grow as I remove a Field and replace it with another. I've made a test for this: on 10,000,000 docs, 50% having a random text value (chosen from 800 constants to limit total string tokens) and 50% nulls, the index size grows by 3.5% compared to no null values (same docs and 800 constants). I wasn't expecting any growth above some bytes; anyway, I think 3.5% is quite good.

If you just add one of the fields (foo or fooIsNull) at a time, we are fine. It could be more of an issue if we always have a fooIsNull:false for consistency, as you mentioned as well. Nevertheless, I think your idea is still better than the straightforward approach. It just comes with more complexity usage-wise.

--Hardy
___
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
Re: [hibernate-dev] Re: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
Hi again :)

One more thing comes to my mind:

On Tue, 22 Apr 2008 18:47:14 +0200, Sanne Grinovero [EMAIL PROTECTED] wrote:

> The Field and StringBridge API would remain as-is;

I am not so sure about that. Looking at the DocumentBuilder code, it would make sense to let the FieldBridge handle the null marker. DocumentBuilder just iterates over the members of the entity and, for each member annotated with @Field, calls the responsible FieldBridge. It also passes along additional annotation values like the boost. It would make sense to handle @IndexNullMarker the same way, right?

--Hardy
Re: [hibernate-dev] Re: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
Hi, hoping the rest of the list users won't hate me:

IMHO it would be preferable to handle this in the DocumentBuilder, so that people won't have to repeat this complexity in custom FieldBridge implementations, and you don't have to update all built-in FieldBridges.

We could make a wrapping FieldBridge that adds this functionality, but the special null marker Field is actually a constant and could be added to the metaproperties at DocumentBuilder initialization. Also, when building the Document you could test for null first and avoid passing all the other options to the FieldBridge, just adding the constant Field to the document and skipping all further processing for the current property.

We could also skip this same processing for properties which don't use the new feature, effectively speeding it up a bit, so we don't feel guilty for complicating the DocumentBuilder but are actually doing some optimization and code cleanup. I don't know if that is a dangerous optimization for backwards compatibility... I expect no FieldBridge to generate Fields on a null value, but someone could rely on it..?

Also, I'm not sure about adding @IndexNullMarker versus adding an option to @Field; what do others think about this?

Emmanuel said:

> If we go that path, we should add a NullQuery class that can be combined with other *Query from Lucene and hide the complexity.

This looks brilliant, but how should we instantiate a new NullQuery? If an @IndexNullMarker option could override the field name used, we would need a factory class with a reference to the DocumentBuilder, or some other way to discover the special field name. So the simplest way is to not have an option to choose the keyword field name and to derive it only from the property name. Would it be acceptable that the user can't override the field name?
regards,
Sanne

2008/4/23, Hardy Ferentschik [EMAIL PROTECTED]:

> Hi again :)
>
> One more thing comes to my mind:
>
> On Tue, 22 Apr 2008 18:47:14 +0200, Sanne Grinovero [EMAIL PROTECTED] wrote:
>
> > The Field and StringBridge API would remain as-is;
>
> I am not so sure about that. Looking at the DocumentBuilder code, it would make sense to let the FieldBridge handle the null marker. DocumentBuilder just iterates over the members of the entity and, for each member annotated with @Field, calls the responsible FieldBridge. It also passes along additional annotation values like the boost. It would make sense to handle @IndexNullMarker the same way, right?
>
> --Hardy
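
For readers of the archive, the DocumentBuilder-side handling discussed in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain Java with a map standing in for a Lucene Document, and the names NullAwareBuilderSketch, Bridge, build and nullFieldName are hypothetical, not Hibernate Search API. It shows the null check happening before the bridge is called, and the marker field name being derived from the property name alone.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these types are Hibernate Search classes.
// A document is modelled as a map of field name -> indexed value.
final class NullAwareBuilderSketch {

    // Stand-in for a FieldBridge; in this sketch it only ever sees non-null values.
    interface Bridge {
        void set(String name, Object value, Map<String, String> document);
    }

    // The marker field name depends on the property name alone, so a query
    // helper can recompute it without a reference to the builder.
    static String nullFieldName(String property) {
        return property + "IsNull";
    }

    static Map<String, String> build(Map<String, Object> properties,
                                     Set<String> markedProperties,
                                     Bridge bridge) {
        Map<String, String> document = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            String name = entry.getKey();
            Object value = entry.getValue();
            if (value == null) {
                // Null check first: add the constant marker if requested,
                // then skip the bridge entirely for this property.
                if (markedProperties.contains(name)) {
                    document.put(nullFieldName(name), "true");
                }
                continue;
            }
            bridge.set(name, value, document);
        }
        return document;
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("value", "cat");
        properties.put("language", null); // marker requested
        properties.put("comment", null);  // no marker requested

        Bridge toStringBridge = (name, value, document) -> document.put(name, value.toString());
        Map<String, String> document =
                build(properties, Collections.singleton("language"), toStringBridge);
        System.out.println(document); // {value=cat, languageIsNull=true}
    }
}
```

Note that in this sketch the bridge is skipped for every null value, including properties without the marker. That is the backwards-compatibility question raised above: a custom bridge that emits fields for null input would behave differently.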
Re: [hibernate-dev] Re: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
On Tue, 22 Apr 2008 03:29:18 +0200, Sanne Grinovero [EMAIL PROTECTED] wrote:

> A new proposal: I got inspired by the 3VL considerations described in Emmanuel's link to wikipedia, and think backwards compatibility is nice: add a @IndexNullMarker on the property, this will add an additional Field to the index for null values:

Hmm, interesting idea. It addresses one of the biggest concerns I have with this null marker thing, namely ambiguities. But what would querying look like in this case? Wouldn't it become harder? Whenever you want to use this feature you would have to combine two fields, foo and fooIsNull, within a boolean query to get the expected result. Something like this: foo:bar OR fooIsNull:true. Of course it would also mean that the index size grows, since we are adding more fields. And the bigger the index, ...

> The Field and StringBridge API would remain as-is; If you prefer not to add an additional annotation, @IndexNullMarker could be dropped if you think adding this field is acceptable for all fields.

I think it should stay an optional and explicit feature. Adding one additional field for each indexed property does not seem justified, especially since we agree that the best solution would be to rethink your design and come up with a proper non-null default. So by offering this feature we might end up encouraging people to stick with their less optimal design ;-)

Cheers,
Hardy
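
To make the query shape in this message concrete, here is a minimal sketch of the companion-field idea, assuming a plain map as a stand-in for a Lucene Document. The class and method names (NullMarkerSketch, index, matchesValueOrNull) are invented for illustration and do not exist in Hibernate Search or Lucene. It indexes either foo or fooIsNull and then evaluates the equivalent of foo:bar OR fooIsNull:true.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a "document" is a plain map of field name -> value,
// standing in for a Lucene Document.
final class NullMarkerSketch {

    // Suffix used to derive the companion field name, e.g. "foo" -> "fooIsNull".
    static final String NULL_SUFFIX = "IsNull";

    // Index one property: a non-null value goes into the regular field,
    // a null value produces only the companion marker field.
    static Map<String, String> index(String field, String value) {
        Map<String, String> doc = new LinkedHashMap<>();
        if (value == null) {
            doc.put(field + NULL_SUFFIX, "true");
        } else {
            doc.put(field, value);
        }
        return doc;
    }

    // Evaluates the equivalent of "foo:bar OR fooIsNull:true" against one document.
    static boolean matchesValueOrNull(Map<String, String> doc, String field, String wanted) {
        boolean valueMatches = wanted.equals(doc.get(field));
        boolean markedNull = "true".equals(doc.get(field + NULL_SUFFIX));
        return valueMatches || markedNull;
    }

    public static void main(String[] args) {
        System.out.println(matchesValueOrNull(index("foo", "bar"), "foo", "bar")); // true
        System.out.println(matchesValueOrNull(index("foo", null), "foo", "bar"));  // true
        System.out.println(matchesValueOrNull(index("foo", "baz"), "foo", "bar")); // false
    }
}
```

Because only one of the two fields is written per document, a literal value that happens to equal the marker keyword can no longer be confused with a null, which is the ambiguity the single-field approach has.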
[hibernate-dev] Re: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
Hey

The more I think about the feature, the less I like it. Here is what I have written in Hibernate Search in Action:

"Hibernate Search, by default, does not store null attributes into the index. Lucene does not have the notion of null fields; the field is simply not there. Hibernate Search could offer the ability (and most likely will in the future) to use a special string as a null marker to still be able to search by null.

But before you jump at the Hibernate Search team's throat, you need to understand why they have not offered this feature so far. Null is not a value per se. Null means that the data is not known (or does not make sense). Therefore, searching by null as if it was a value is somewhat odd. The authors are well aware that this is a raging debate, especially amongst the relational model experts (see http://en.wikipedia.org/wiki/Null_%28SQL%29).

Whenever you feel the need for searching by null, you should ask yourself if storing a special marker value in the database would make more sense. If you store a special marker value in the database, a lot of the null inconsistencies vanish. It also has the side effect of being queryable in Lucene and Hibernate Search."

So before we jump on the boat for this feature, I would like to know if people think it's still a good idea to offer it.

To answer your questions, the reason why I do not pass @Field but the raw set of data is that @Field.index is translated into its Lucene representation: some work is done. Most people will write a StringBridge implementation anyway, where the null handling will be taken care of transparently (by String2FieldBridgeAdaptor).

I think I like 1 or 3. Note that get should be changed as well. Three is interesting indeed; rename it IndexingStrategy.

On Apr 21, 2008, at 10:07, Hardy Ferentschik wrote:

> Hi Emmanuel,
>
> what's your take on this? Just adding another String parameter will work, but are we not getting too many parameters into the method? Wouldn't it be nicer to pass the actual @Field annotation? I think this might also make things clearer for the implementor of the interface.
>
> I am also trying here to get a little into your head to understand your ideas behind the code design - hope you don't mind ;-)
>
> --Hardy

--- Forwarded message ---
From: Hardy Ferentschik (JIRA) [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
Date: Mon, 21 Apr 2008 14:04:33 +0200

[ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_30032 ]

Hardy Ferentschik commented on HSEARCH-115:
---

Ok, here are a few suggestions:

1. This is the simplest way. Basically just add a new property named 'indexNullAs' to @Field and @ClassBridge. Accordingly, extend the FieldBridge interface to set(String name, Object value, Document document, Field.Store store, Field.Index index, Field.TermVector termVector, Float boost, String indexNullAs).

2. Alternatively, one could change the FieldBridge API to actually pass in the Field annotation itself: set(String name, Object value, Document document, Field fieldAnnotation, Float boost). This would reduce the number of parameters and might actually be more transparent for users implementing custom bridges. Unfortunately, one would have to introduce a ClassBridge interface as well in this case. I am not sure whether it is a good design choice to pass annotation instances around.

3. We could also change the API into something like this: set(String name, Object value, Document document, IndexProperties props), where IndexProperties is just a wrapper class for Field.Store, Field.Index, ... The drawback is that this just increases the number of classes.

Any comments?
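
As a rough illustration of suggestion 3, the sketch below bundles the indexing options, including the indexNullAs value from suggestion 1, into a single wrapper that is passed to the bridge. Everything here is an assumption made for the example: the Store and Index enums are local stand-ins for Lucene's Field.Store and Field.Index, the document is a plain map, and IndexPropertiesSketch, IndexProperties and STRING_BRIDGE are hypothetical names rather than the API that was eventually adopted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of suggestion 3; not the actual Hibernate Search API.
final class IndexPropertiesSketch {

    // Local stand-ins for Lucene's Field.Store and Field.Index.
    enum Store { YES, NO }
    enum Index { TOKENIZED, UN_TOKENIZED }

    // Wrapper so that new options do not change the bridge method signature.
    static final class IndexProperties {
        final Store store;
        final Index index;
        final Float boost;        // may be null when no boost is configured
        final String indexNullAs; // may be null when null values are not indexed

        IndexProperties(Store store, Index index, Float boost, String indexNullAs) {
            this.store = store;
            this.index = index;
            this.boost = boost;
            this.indexNullAs = indexNullAs;
        }
    }

    // Stand-in for the changed FieldBridge signature.
    interface Bridge {
        void set(String name, Object value, Map<String, String> document, IndexProperties props);
    }

    // Writes the value, or the configured null replacement when the value is null.
    static final Bridge STRING_BRIDGE = (name, value, document, props) -> {
        if (value != null) {
            document.put(name, value.toString());
        } else if (props.indexNullAs != null) {
            document.put(name, props.indexNullAs);
        }
    };

    public static void main(String[] args) {
        IndexProperties props = new IndexProperties(Store.NO, Index.UN_TOKENIZED, null, "null");
        Map<String, String> document = new LinkedHashMap<>();
        STRING_BRIDGE.set("value", "cat", document, props);
        STRING_BRIDGE.set("lang", null, document, props);
        System.out.println(document); // {value=cat, lang=null}
    }
}
```

The trade-off is the one named in the list: one more class, in exchange for a signature that stays stable when another option is added later.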
Add a default value for indexing null value
---
Key: HSEARCH-115
URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-115
Project: Hibernate Search
Issue Type: Improvement
Components: mapping
Reporter: Julien Brulin
Assignee: Hardy Ferentschik
Fix For: 3.1.0

Hi,

Null elements are not indexed by Lucene, so it's not easy to use a nullable property in a Lucene query. I have a TagTranslation entity in my model with a nullable property language. In this case null is used as the default language for a tag translation. Each translation may have many variations, like synonyms. Because I can't specify a default value for null in the @Field annotation, like this: @Field(index=Index.UN_TOKENIZED, store=Store.NO, default='null'), I can't search for a cat tag with a default translation like this: +value:cat* +lang:null

@Entity()
@Table(name=indexing_tag_trans)