On 10 August 2015 at 11:04, Hardy Ferentschik <ha...@hibernate.org> wrote:
> Hi,
>
> sorry, I am late to the game, but here are some more thoughts on this.
>
> I think the consensus so far is that
>
> # Date/time types which represent an instant in time are treated as usual.
> They can be encoded as a string (by default yyyyMMddHHmmssSSS) or numerically,
> in which case the numeric long value equals the epoch time of the
> represented date.
Correct, that's the consensus so far. I'd like to challenge one more
detail though: does it still make sense to allow string encoding? I
think not. We allowed it primarily because a long time ago that was the
only way; then it became one of the options (but still the default), and
more recently it became the non-default way.
With these new types, backwards compatibility is a non-issue. So unless
someone makes a strong case for needing these as String in the index,
how about we drop some complexity?

Remember:
 - Hibernate Search is not an Object/index mapper, so we're not aiming
   at creating any index schema possible; we're aiming at taking
   advantage of the index for practical purposes ("I want it to be a
   string in the index" is not a valid argument; use your own
   FieldBridge in that case)
 - With Projections we have to transform things back into their
   original Java type, so how we encode things in the index is
   irrelevant from a semantics point of view; I think the only valid
   challenge would need to come from a performance or storage space
   perspective, and in both cases I'm pretty sure the numeric encoding
   would win.

> # Date/time types which do not represent an instant in time can also be
> encoded as string or number, but in the latter case the numeric
> representation is given by interpreting the string representation as a number.
>
> So far so good. There are a couple more things to think about.
>
> # Query time gets interesting and I think we need to improve the DSL in unison
> with adding support for these new types.
> Check out this example from DSLTest [1]:
>
>     query = monthQb
>             .range()
>                 .onField( "estimatedCreation" )
>                     .ignoreFieldBridge()
>                 .andField( "justfortest" )
>                     .ignoreFieldBridge().ignoreAnalyzer()
>                 .from( DateTools.round( from, DateTools.Resolution.MINUTE ) )
>                 .to( DateTools.round( to, DateTools.Resolution.MINUTE ) )
>                     .excludeLimit()
>             .createQuery();
>
> If a date is numerically encoded you need to specify numbers for the from and
> to values. ATM, we recommend using the Lucene-specific DateTools to get the
> numeric representation. With the support of the new date types things will get
> confusing for the user. How does one "create" the numeric representation of a
> LocalDate (and how does one know what it looks like in the first place and how
> it differs from the epoch time)?

Great point, we should accept the user's domain type exclusively and
take the conversion burden from the user, especially since we know the
correct conversion strategy.

> We have been discussing before whether Hibernate Search needs to offer its
> own version of DateTools. I think it would be time to do so and include
> helpers for the new date/time types. This also reduces the exposure to
> Lucene-specific types.

+1 to encapsulate it, but I don't expect people to need it at all in
the above case? Still, good for other more advanced needs.

> Even better though would be if we were able to directly support the use
> of date types in the from and to clauses. It would be the responsibility of
> the DSL to round the specified types to the appropriate level based on the
> field's configuration/metadata. Even in this scenario though, a
> Search-specific DateTools might be necessary for the cases where the date
> specified in to/from needs to be rounded differently than the field itself.

+1

> Last but not least, the documentation needs to be updated. At the moment, the
> docs are silent about all the complexity around dates.
> With the support of the new types, the docs need to be more explicit and
> describe the subtleties at play.

+1, created HSEARCH-1958

Thanks,
Sanne

>
> --Hardy
>
>
> On Wed, Aug 05, 2015 at 05:40:16PM +0100, Sanne Grinovero wrote:
>> On 5 August 2015 at 17:22, Davide D'Alto <dav...@hibernate.org> wrote:
>> >> Proposal: use numeric but still - rather than taking the milliseconds
>> >> from epoch, take the resulting number from YYYYMMDD ?
>> >
>> > I don't think I understand what you mean with "the resulting number from
>> > YYYYMMDD".
>> > Wouldn't it be similar to getting the number of days from epoch?
>>
>> No, because epoch is a specific moment *with a timezone*. If you take a
>> calendar date "here", and take the moment in time which represents
>> your beginning of the calendar date, the distance from epoch is not a
>> whole number and you'd have to apply rounding which is timezone
>> specific.
>>
>> By simply encoding the number in the above format, you'd encode today
>> as the number "20150805".
>> That's a whole number which avoids the timezone relativity and can be
>> efficiently encoded in numeric form, and provides the expected sorting
>> properties.
>>
>> >
>> > But basically, you are saying that I can use different numeric encoding for
>> > different types. Isn't it?
>>
>> Yes, you definitely need different encodings depending on the type and
>> the used options.
>>
>> > So, for example:
>> >
>> > java.util.Date, java.util.Calendar and java.time.Instant,
>> > java.time.LocalDateTime will use number of milliseconds from epoch
>> > java.time.LocalDate: number of days from epoch
>>
>> Except this one ^ I agree with the others.
>>
>> > java.time.LocalTime: number of nanos in a day
>>
>> Conceptually, yes.. but we don't have "nanoseconds" as an option of
>> org.hibernate.search.annotations.Resolution. Should we add it?
>> We would not be able to apply that Resolution on old fashioned
>> Date/Calendar, so that would need a warning or even an exception when
>> applied to old style value types.
>>
>> >> Ok that works but why write all those zeros in the index, when you can
>> >> just write the date. I realize storage is cheap, but still we need to
>> >> be careful as the index size affects performance ;-)
>> >
>> > I don't think we need to store the 0s.
>> > If I know the type of the field I already know the time is 0.
>>
>> Exactly
>>
>> > Am I missing something?
>>
>> I probably just misunderstood your proposal, since previously you
>> mentioned: "I would just consider a LocalDate the same as a
>> LocalDateTime with time 00:00:000 (UTC time zone)".
>> If you have to write the days only you don't need to convert to a time first.
>> This misunderstanding might be related to the fact that you were
>> planning to encode as distance from epoch.. see my first comment on
>> this same email.
>> Since you don't want to look at distance from epoch for this case, the
>> time component really is irrelevant and LocalDate has all the
>> information you need.. simpler ;)
>>
>> Sanne
>>
>>
>> >
>> >
>> > On Wed, Aug 5, 2015 at 5:00 PM, Sanne Grinovero <sa...@hibernate.org> wrote:
>> >
>> >> On 5 August 2015 at 16:27, Gunnar Morling <gun...@hibernate.org> wrote:
>> >> >> as I'd like us to consider not applying DateBridge on the new types
>> >> >> as it doesn't seem to add much practical value.
>> >> >
>> >> > Ok, that may make sense for types such as LocalDate. But there are types
>> >> > in the new API which - unlike LocalDate - do describe an exact instant on
>> >> > the time line (e.g. ZonedDateTime, Instant). For those IMO it makes sense
>> >> > for sure to support both encodings, NUMERIC and STRING (similar to
>> >> > Date/Calendar so far) and thus apply @DateBridge.
>> >>
>> >> +1
>> >>
>> >> > Question is whether/how to index/persist TZ information, for Calendar it
>> >> > seems not to have been persisted in the index so far?
>> >>
>> >> It's encoding the Calendar's time as distance from epoch, which is a
>> >> neutral encoding so you don't need the TZ.
>> >>
>> >> For the old style Date/Calendar types we always assumed the value was
>> >> a point-in-time, unless explicitly opting in for an alternative
>> >> encoding.
>> >> For example for the "birthday use case" a reasonable setting would
>> >> have been String encoding with resolution=DAY, although passing in a
>> >> Date instance having the right value (as in right timezone) would have
>> >> been the user's responsibility.. we simply take the long it's storing and
>> >> index that with the requested resolution.
>> >>
>> >> Sanne
>> >>
>> >> >
>> >> >
>> >> > 2015-08-05 17:10 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
>> >> >>
>> >> >> Inline:
>> >> >>
>> >> >> On 5 August 2015 at 15:42, Davide D'Alto <dav...@hibernate.org> wrote:
>> >> >> > If a user selects a resolution that does not make much sense we can
>> >> >> > log a warning.
>> >> >>
>> >> >> +1 And update the javadoc to mention that some resolution values don't
>> >> >> apply
>> >> >>
>> >> >> > But I think this might make sense:
>> >> >> >
>> >> >> > @DateBridge(resolution=MONTH)
>> >> >> > LocalDate birthday;
>> >> >>
>> >> >> Ok but how often do you think that will be used?
>> >> >> Sorry playing devil's advocate here, as I'd like us to consider not
>> >> >> applying DateBridge on the new types as it doesn't seem to add much
>> >> >> practical value.
>> >> >>
>> >> >> I agree it's worth a shot, but while going ahead keep in mind that
>> >> >> maybe simplifying that is the more elegant solution.
>> >> >>
>> >> >> > On Wed, Aug 5, 2015 at 3:37 PM, Davide D'Alto <dav...@hibernate.org> wrote:
>> >> >> >
>> >> >> >> > What would you do though in case of the following:
>> >> >> >> >
>> >> >> >> > @DateBridge
>> >> >> >> > LocalDate myDate;
>> >> >> >> >
>> >> >> >> > encoding() defaults to NUMERIC, so would you a) raise an error, or
>> >> >> >> > b) ignore encoding() for LocalDate and friends? Both seem not right
>> >> >> >> > to me. I think there is nothing wrong with using NUMERIC encoding
>> >> >> >> > per-se for these types. We may recommend STRING but if NUMERIC
>> >> >> >> > really is what a user wants I would let them do so.
>> >> >>
>> >> >> I'm all for letting the users have the last word, but this is one of
>> >> >> those cases in which you don't know if they explicitly want that or
>> >> >> simply went with the defaults.
>> >> >>
>> >> >> Not a big problem as of course the important thing of defaults is that
>> >> >> "they work" but I'd really prefer the default to try to be the most
>> >> >> appropriate encoding, which is not numeric in this case.
>> >> >>
>> >> >> Proposal: use numeric but still - rather than taking the milliseconds
>> >> >> from epoch, take the resulting number from YYYYMMDD ? It might even be
>> >> >> the most efficient encoding, as you don't have the drawback of
>> >> >> clustering which we would have with a numeric encoding working on the
>> >> >> individual fields, and doesn't have the bloat of string encoding.
>> >> >>
>> >> >> >>
>> >> >> >> +1
>> >> >> >>
>> >> >> >> > What do you suggest we do if a user maps the following?
>> >> >> >>
>> >> >> >> > @DateBridge(resolution=MILLISECOND)
>> >> >> >> > LocalDate birthday;
>> >> >> >>
>> >> >> >>
>> >> >> >> Nothing really,
>> >> >> >> I would just consider a LocalDate the same as a LocalDateTime with
>> >> >> >> time 00:00:000 (UTC time zone)
>> >> >>
>> >> >> Ok that works but why write all those zeros in the index, when you can
>> >> >> just write the date. I realize storage is cheap, but still we need to
>> >> >> be careful as the index size affects performance ;-)
>> >> >>
>> >> >> Sanne
>> >> >>
>> >> >> >>
>> >> >> >> It is equivalent to:
>> >> >> >> ZonedDateTime dateTime = date.atStartOfDay( ZoneOffset.UTC );
>> >> >> >>
>> >> >> >>
>> >> >> >> On Wed, Aug 5, 2015 at 3:24 PM, Gunnar Morling <gun...@hibernate.org> wrote:
>> >> >> >>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> 2015-08-05 12:41 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
>> >> >> >>>
>> >> >> >>>> Our current implementation converts Date into the long "distance
>> >> >> >>>> from epoch" to allow correct range-queries treating each Date as an
>> >> >> >>>> instant in time - allowing a universal sorting strategy. But a
>> >> >> >>>> LocalDate is not an instant-in-time.
>> >> >> >>>>
>> >> >> >>>> A LocalDate is intentionally oblivious of the timezone; as the
>> >> >> >>>> javadoc states, it's useful for birthdays, i.e. symbolic occurrences
>> >> >> >>>> and potentially legal matters which don't fit into a universal
>> >> >> >>>> sorting model but rather with the local political scene - we would
>> >> >> >>>> need the combo {LocalDate, ZoneId} provided to be able to allow
>> >> >> >>>> sorting across different LocalDate - or simply assume that they are
>> >> >> >>>> all referring to the same Zone.
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>> Right, I had the latter in mind and would use UTC for that purpose.
>> >> >> >>>
>> >> >> >>>>
>> >> >> >>>> I think that if the user is using a LocalDate type, he's implicitly
>> >> >> >>>> hinting that the timezone is not relevant for the practical use
>> >> >> >>>> (possibly even wrong); the most faithful representation would be the
>> >> >> >>>> string form in ISO standard format or to encode the day, month, year
>> >> >> >>>> as independent fields? This last detail depends on how it would be
>> >> >> >>>> more efficient to store & query; probably the String format YYYYMMDD
>> >> >> >>>> would be the most efficient internal representation to allow also
>> >> >> >>>> correct sorting.
>> >> >> >>>>
>> >> >> >>>> I wouldn't use NumericField(s) in this case, as they are more
>> >> >> >>>> effective only with larger ranges, while MM and DD are very short;
>> >> >> >>>> not sure if it's worth splitting the year as a NumericField either,
>> >> >> >>>> as the values will likely be strongly clustered in the same range of
>> >> >> >>>> "recent years" - although that might depend on the application but it
>> >> >> >>>> doesn't seem worth the complexity, so I'd index & store as a String
>> >> >> >>>> YYYYMMDD.
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>> Agreed that this makes most sense, given the "symbolic" nature of
>> >> >> >>> LocalDate.
>> >> >> >>>
>> >> >> >>> What would you do though in case of the following:
>> >> >> >>>
>> >> >> >>> @DateBridge
>> >> >> >>> LocalDate myDate;
>> >> >> >>>
>> >> >> >>> encoding() defaults to NUMERIC, so would you a) raise an error, or
>> >> >> >>> b) ignore encoding() for LocalDate and friends? Both seem not right
>> >> >> >>> to me. I think there is nothing wrong with using NUMERIC encoding
>> >> >> >>> per-se for these types.
>> >> >> >>> We may recommend STRING but if NUMERIC really is what a user wants I
>> >> >> >>> would let them do so.
>> >> >> >>>
>> >> >> >>>>
>> >> >> >>>> -- Sanne
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> On 5 August 2015 at 11:10, Gunnar Morling <gun...@hibernate.org> wrote:
>> >> >> >>>> > Hi,
>> >> >> >>>> >
>> >> >> >>>> > What's the motivation for using a different representation in
>> >> >> >>>> > that case?
>> >> >> >>>> >
>> >> >> >>>> > For the sake of consistency, I'd use milliseconds since
>> >> >> >>>> > 1970-01-01 across the board. Otherwise it'll be more difficult
>> >> >> >>>> > to compare fields created from properties of different date
>> >> >> >>>> > types.
>> >> >> >>>> >
>> >> >> >>>> > --Gunnar
>> >> >> >>>> >
>> >> >> >>>> >
>> >> >> >>>> > 2015-08-04 19:49 GMT+02:00 Davide D'Alto <dav...@hibernate.org>:
>> >> >> >>>> >
>> >> >> >>>> >> Hi,
>> >> >> >>>> >> I started to work on the creation of the bridges for the
>> >> >> >>>> >> classes in the java.time package.
>> >> >> >>>> >>
>> >> >> >>>> >> I was wondering if we want to convert the values to long using
>> >> >> >>>> >> the existing approach we have now for java.util.Date.
>> >> >> >>>> >>
>> >> >> >>>> >> In Hibernate Search a java.util.Date is converted into a long
>> >> >> >>>> >> that represents the number of milliseconds since January 1,
>> >> >> >>>> >> 1970, 00:00:00 GMT using getTime().
>> >> >> >>>> >>
>> >> >> >>>> >> The same value can be obtained from a java.time.LocalDate via:
>> >> >> >>>> >>
>> >> >> >>>> >> long epochMilli = date.atStartOfDay( ZoneOffset.UTC ).toInstant().toEpochMilli();
>> >> >> >>>> >>
>> >> >> >>>> >> LocalDate has a method that returns the same value expressed in
>> >> >> >>>> >> number of days:
>> >> >> >>>> >>
>> >> >> >>>> >> long epochDay = date.toEpochDay();
>> >> >> >>>> >>
>> >> >> >>>> >>
>> >> >> >>>> >> I would use the second approach
>> >> >> >>>> >>
>> >> >> >>>> >> Davide
>> >> >> >>>> >> _______________________________________________
>> >> >> >>>> >> hibernate-dev mailing list
>> >> >> >>>> >> hibernate-dev@lists.jboss.org
>> >> >> >>>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
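
For reference, the encodings discussed in this thread can be sketched in
plain Java roughly as follows. This is only an illustration under stated
assumptions, not Hibernate Search code: the class DateEncodingSketch and
its method names are hypothetical, only java.time calls are used, and the
YYYYMMDD packing assumes years in the range 1 to 9999.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneOffset;

// Hypothetical helper, written only to illustrate the thread; not a Hibernate Search API.
final class DateEncodingSketch {

    private DateEncodingSketch() {
    }

    // Instant-like types: milliseconds since 1970-01-01T00:00:00Z.
    static long encodeInstant(Instant instant) {
        return instant.toEpochMilli();
    }

    // LocalDate packed as the whole number YYYYMMDD; no timezone is involved.
    // Assumption: the year is between 1 and 9999.
    static int encodeLocalDate(LocalDate date) {
        return date.getYear() * 10_000 + date.getMonthValue() * 100 + date.getDayOfMonth();
    }

    // Inverse of encodeLocalDate, as would be needed to rebuild the value for a projection.
    static LocalDate decodeLocalDate(int encoded) {
        return LocalDate.of(encoded / 10_000, (encoded / 100) % 100, encoded % 100);
    }

    // LocalTime as nanoseconds since midnight.
    static long encodeLocalTime(LocalTime time) {
        return time.toNanoOfDay();
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2015, 8, 5);

        // The YYYYMMDD form proposed in the thread.
        System.out.println(encodeLocalDate(date)); // prints 20150805

        // The two alternatives from the original message.
        long epochDay = date.toEpochDay();
        long epochMilli = date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        System.out.println(epochDay);
        System.out.println(epochMilli);

        // The packed form keeps calendar order, so range queries and sorting still work.
        System.out.println(encodeLocalDate(LocalDate.of(2014, 12, 31)) < encodeLocalDate(date));
    }
}
```

Note that for a LocalDate the epoch-millisecond value at UTC midnight is
just toEpochDay() multiplied by 86,400,000, which is the run of trailing
zeros the thread argues does not need to be written to the index.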