Re: Solr: How to index range-pair fields?
Sorry Venkat, this is pushing beyond my immediate knowledge. You'd just need to experiment.

But the document still looks a bit wrong. Specifically, I don't understand where those extra 366 values are coming from. Each value should be just a two-dimensional coordinate: the first number for the start of the range, the second for the end. You seem to have two extra useless ones.

Regards,
Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/

On 21 August 2015 at 21:29, vaedama sudheer.u...@gmail.com wrote:
> { id: X, state: [absent, present],
>   presentDays: [ [01 15 366 366], [43, 46, 366, 366], [78, 84, 366, 366] ],
>   absentDays: [ [25, 30, 366, 366], [32, 43, 366, 366] ] }
> [...]
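To make the "two-dimensional coordinate" point concrete, here is a minimal Python sketch of how the document for student X might be built on the client side. The helper names (`day_of_year`, `to_point`, `build_doc`) are hypothetical, and the space-separated "x y" point string is an assumption about what a SpatialRecursivePrefixTreeFieldType with geo=false accepts; verify it against your Solr version before relying on it.

```python
from datetime import date


def day_of_year(d: date) -> int:
    """Map a calendar date to its 1-based day number within its year."""
    return d.timetuple().tm_yday


def to_point(start: date, end: date) -> str:
    """Encode one range as a single 2-D point: X = start day, Y = end day."""
    return f"{day_of_year(start)} {day_of_year(end)}"


def build_doc(student_id, absent, present):
    """Build a document with one point per absence/presence period."""
    doc = {"id": student_id, "state": []}
    if absent:
        doc["state"].append("absent")
        doc["absentDays"] = [to_point(s, e) for s, e in absent]
    if present:
        doc["state"].append("present")
        doc["presentDays"] = [to_point(s, e) for s, e in present]
    return doc


# Student X from the thread (year ignored, as in the question).
doc = build_doc(
    "X",
    absent=[
        (date(2015, 1, 1), date(2015, 1, 15)),
        (date(2015, 2, 13), date(2015, 2, 16)),
        (date(2015, 3, 19), date(2015, 3, 25)),
    ],
    present=[
        (date(2015, 1, 25), date(2015, 1, 30)),
        (date(2015, 2, 1), date(2015, 2, 12)),
    ],
)
print(doc)
```

Each period becomes exactly two numbers, with no 366 padding. Note also that date arithmetic gives day 44 for Feb 13, 2015 and day 47 for Feb 16, 2015, rather than 43 and 46, and that the absence periods land in absentDays rather than presentDays.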
Re: Solr: How to index range-pair fields?
I can't find the discussion/presentation about it (from about 2 years ago), but basically you can use a LatLong geographic field to do this. You represent the start date/time on the X axis and the end date/time on the Y axis. Then, for search, you intersect it with a rectangle of your desired check dates.

Hopefully this is enough for you to go on.

Regards,
Alex.

On 20 August 2015 at 21:14, vaedama sudheer.u...@gmail.com wrote:
> I have a students database. I want to query all the students who were either `absent` or `present` during a particular `date-range`.
> [...]
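The rectangle trick described in this reply can be sketched in plain Python, without Solr, to show which rectangle corresponds to which question. Everything here is illustrative: the function names are hypothetical, the world bounds of 0 to 366 are taken from the fieldType posted elsewhere in the thread, and the `Intersects(minX minY maxX maxY)` filter string follows the Solr 4 spatial query syntax as I understand it, so treat it as an assumption to test.

```python
from typing import NamedTuple, Tuple

WORLD_MIN, WORLD_MAX = 0, 366  # assumed to match worldBounds="0 0 366 366"


class Rect(NamedTuple):
    min_x: int
    min_y: int
    max_x: int
    max_y: int


def contains_rect(q_start: int, q_end: int) -> Rect:
    """Rectangle matching indexed ranges that fully contain [q_start, q_end].

    An indexed range (start, end) contains the query when
    start <= q_start and end >= q_end.
    """
    return Rect(WORLD_MIN, q_end, q_start, WORLD_MAX)


def overlaps_rect(q_start: int, q_end: int) -> Rect:
    """Rectangle matching indexed ranges that overlap [q_start, q_end].

    An indexed range (start, end) overlaps the query when
    start <= q_end and end >= q_start.
    """
    return Rect(WORLD_MIN, q_start, q_end, WORLD_MAX)


def point_in_rect(point: Tuple[int, int], rect: Rect) -> bool:
    """Local stand-in for what the spatial index checks for each point."""
    x, y = point
    return rect.min_x <= x <= rect.max_x and rect.min_y <= y <= rect.max_y


def solr_filter(field: str, rect: Rect) -> str:
    """Format the rectangle as a filter query (syntax is an assumption)."""
    return (f'{field}:"Intersects({rect.min_x} {rect.min_y} '
            f'{rect.max_x} {rect.max_y})"')


# Absence periods of student X as (start day, end day) points.
absent_points = [(1, 15), (44, 47), (78, 84)]

# Was X absent for the whole of days 5..10?
rect = contains_rect(5, 10)
print(any(point_in_rect(p, rect) for p in absent_points))
print(solr_filter("absentDays", rect))
```

Because each range is a single point, a multi-valued field keeps every start tied to its own end, which is the pairing the original question was trying to preserve.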
Re: Solr: How to index range-pair fields?
Alexandre,

What would the data type look like? Currently, this is what I have:

    <fieldType name="days_of_year"
               class="solr.SpatialRecursivePrefixTreeFieldType"
               geo="false"
               worldBounds="0 0 366 366"
               distErrPct="0"
               maxDistErr="0.0009"
               units="degrees" />

    <field name="state" type="string" indexed="true" stored="true" multiValued="true" />
    <field name="presentDays" type="days_of_year" indexed="true" stored="true" multiValued="true" />
    <field name="absentDays" type="days_of_year" indexed="true" stored="true" multiValued="true" />

This is how I am indexing each record:

    for-each student:
        get the presence/absence period list
        for each presence/absence period:
            get the state (either presence or absence) and add the value to the *state* field inside the doc
            if the state is absence, add the absence period to the *absentDays* field
            if the state is presence, add the presence period to the *presentDays* field

So, for student X (taken from my previous msg):

Student X was `absent` between dates:
    Jan 1, 2015 and Jan 15, 2015
    Feb 13, 2015 and Feb 16, 2015
    March 19, 2015 and March 25, 2015

Also X was `present` between dates:
    Jan 25, 2015 and Jan 30, 2015
    Feb 1, 2015 and Feb 12, 2015

This is how I think my student record would look. Does it look correct?

    {
      id: X,
      state: [absent, present]
      presentDays: [ [01 15 366 366], [13, 16, 366, 366], [19, 25, 366, 366] ]
      absentDays: [ [25, 30, 366, 366], [1, 12, 366, 366] ]
    }

Also, how would I represent the year in this case: Student Y was absent between Jan 1, *2012* and Feb 1, *2015*?

I would appreciate it if you could provide an example of how to modify my fieldType definition to store timestamp level granularity.

Thanks,
Sudheer

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-How-to-index-range-pair-fields-tp4224369p4224526.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr: How to index range-pair fields?
On 21 August 2015 at 15:32, vaedama sudheer.u...@gmail.com wrote:
> presentDays: [ [01 15 366 366], [13, 16, 366, 366], [19, 25, 366, 366] ]

This does not look right. Your January 1 2015 should map to a single number, representing X in the coordinates. Your January 15 2015 should map to another number, representing Y in the coordinates. That's why the world bounds are 0-366 (366 being the maximum number of days in a year, ignoring the specific year). So, if you ignore the year, January 1 is '1', January 15 is '15', February 1 is '32', etc.

If you don't ignore the year, you need to factor it in somehow, perhaps as a day offset from a particular start position, e.g. 2010. You would need to do some date math during indexing, either in the client or in an UpdateRequestProcessor.

Regards,
Alex.
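As a rough illustration of the date math mentioned above, the following Python sketch does the conversion on the client side. The function names and the choice of January 1, 2010 as the start position are assumptions made for the example; only the idea of "a day offset from a particular start position" comes from the message.

```python
from datetime import date

BASE = date(2010, 1, 1)  # hypothetical start position, as in "e.g. 2010"


def day_of_year(d: date) -> int:
    """Day number within the year, ignoring which year it is."""
    return d.timetuple().tm_yday


def days_since_base(d: date, base: date = BASE) -> int:
    """Day offset from the start position, so the year is factored in."""
    return (d - base).days


# Ignoring the year: the mappings quoted in the message.
print(day_of_year(date(2015, 1, 1)))   # January 1
print(day_of_year(date(2015, 1, 15)))  # January 15
print(day_of_year(date(2015, 2, 1)))   # February 1

# Factoring in the year: Student Y, absent Jan 1, 2012 to Feb 1, 2015.
x = days_since_base(date(2012, 1, 1))
y = days_since_base(date(2015, 2, 1))
print(f"{x} {y}")
```

With an offset like this, the world bounds would have to grow from 366 to cover the largest offset you expect to index, for example the number of days between the start position and the end of the last year you care about.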
Re: Solr: How to index range-pair fields?
These look right. Then, you just play around with the mapping. Your dates-to-coordinates mapping could be as granular as you want, as long as the values fit into the data type. And with this being a school, your epochs might be smaller (e.g. semesters) and kept as a separate number.

Regards,
Alex.

On 21 August 2015 at 13:57, vaedama sudheer.u...@gmail.com wrote:
> I guess these are the links that you were referring to :)
> [...]
> That would work if the time ranges were days_of_year. But I want to also maintain the timestamp level granularity.
Re: Solr: How to index range-pair fields?
You can always index to a second field with date math, or even pull out the day as you're indexing.

Best,
Erick

On Fri, Aug 21, 2015 at 10:57 AM, vaedama sudheer.u...@gmail.com wrote:
> That would work if the time ranges were days_of_year. But I want to also maintain the timestamp level granularity.
> [...]
Re: Solr: How to index range-pair fields?
Hello Erick,

Thanks for your reply.

> You can always index to a second field with date math, or even pull out the day as you're indexing.

What would this second field look like? Could you please provide an example of both the fieldType definition and the field definition? Also, please tell me how my query would fit into this. Specifically, my use-case is CONTAINS (I do not care about INTERSECT or WITHIN).

Thanks in advance.

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-How-to-index-range-pair-fields-tp4224369p4224520.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr: How to index range-pair fields?
Hi Alexandre,

Thanks for your reply! I guess these are the links that you were referring to :)

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3c1354991310424-4025359.p...@n3.nabble.com%3E
https://wiki.apache.org/solr/SpatialForTimeDurations
https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/

That would work if the time ranges were days_of_year. But I want to also maintain the timestamp level granularity. Apologies for not mentioning that in my earlier email.

Thanks,
Venkat Sudheer Reddy Aedama

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-How-to-index-range-pair-fields-tp4224369p4224508.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr: How to index range-pair fields?
Alexandre,

Fantastic answer! I think having a start position would work nicely with my use-case :) Also I would prefer to do the date math during indexing.

*Question #1:* Can you please tell me if this doc looks correct (given that I am not yet bothered about factoring the year into my use-case)?

Student X was `absent` between dates:
    Jan 1, 2015 and Jan 15, 2015
    Feb 13, 2015 and Feb 16, 2015 (assuming that Feb 13 is 43rd day in the year 2015 and Feb 16 is 46th day)
    March 19, 2015 and March 25, 2015

Also X was `present` between dates:
    Jan 25, 2015 and Jan 30, 2015
    Feb 1, 2015 and Feb 12, 2015

    {
      id: X,
      state: [absent, present],
      presentDays: [ [01 15 366 366], [43, 46, 366, 366], [78, 84, 366, 366] ],
      absentDays: [ [25, 30, 366, 366], [32, 43, 366, 366] ]
    }

*Question #2:* Since I need timestamp level granularity, what is the appropriate way to store the field?

Student X was `absent` between epoch times:
    1420104600 (9:30 AM, Jan 1, 2015) and 1421341200 (5:00 PM, Jan 15, 2015)

Is it possible to change *worldBounds* to take a polygon structure where I can represent millisecond level granularity?

Thanks in advance,
Venkat Sudheer Reddy Aedama

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-How-to-index-range-pair-fields-tp4224369p4224582.html
Sent from the Solr - User mailing list archive at Nabble.com.
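For Question #2, one possible direction (a sketch only, not something confirmed anywhere in this thread) is to keep the same start/end point idea but measure both coordinates in a finer unit relative to a start position. The base of January 1, 2010 UTC, the one-minute unit, and the function names below are all assumptions chosen for illustration. Whether the prefix-tree field can index the resulting value range at the precision you need is untested here and would have to be checked by experiment.

```python
from datetime import datetime, timezone

# Hypothetical start position: 2010-01-01 00:00:00 UTC, in epoch seconds.
BASE_EPOCH = int(datetime(2010, 1, 1, tzinfo=timezone.utc).timestamp())


def epoch_to_coord(epoch_seconds: int, unit_seconds: int = 60) -> int:
    """Convert epoch seconds to whole units elapsed since the start position.

    unit_seconds=60 gives minute granularity; 1 would keep full seconds.
    """
    return (epoch_seconds - BASE_EPOCH) // unit_seconds


def to_point(start_epoch: int, end_epoch: int, unit_seconds: int = 60) -> str:
    """Encode one absence/presence period as an "x y" point string."""
    x = epoch_to_coord(start_epoch, unit_seconds)
    y = epoch_to_coord(end_epoch, unit_seconds)
    return f"{x} {y}"


# The absence period from the question (times are 9:30 and 17:00 UTC).
point = to_point(1420104600, 1421341200)
print(point)
```

A coarser unit keeps the coordinate values, and therefore the required world bounds, smaller; choosing the unit is a trade-off between granularity and how large a coordinate space the field has to cover.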
Solr: How to index range-pair fields?
My scenario is something like this: I have a students database. I want to query all the students who were either `absent` or `present` during a particular `date-range`.

For example:

Student X was `absent` between dates:
    Jan 1, 2015 and Jan 15, 2015
    Feb 13, 2015 and Feb 16, 2015
    March 19, 2015 and March 25, 2015

Also X was `present` between dates:
    Jan 25, 2015 and Jan 30, 2015
    Feb 1, 2015 and Feb 12, 2015

(Other days were either school holidays, or the teacher was lazy or forgot to take the attendance ;)

If the date range was only a single-valued field then this approach would work: http://stackoverflow.com/questions/25246204/solr-query-for-documents-whose-from-to-date-range-contains-the-user-input. I have multiple date ranges for each student, so this would not work for my use-case.

Lucene/Solr 5.0 has support for `DateRangeField` (http://lucene.apache.org/solr/5_0_0/solr-core/index.html?org/apache/solr/schema/DateRangeField.html), which is perfect for my use-case, but I cannot upgrade to 5.0 yet! I am on Lucene 4.1.0. David Smiley had mentioned that it would be ported to 4.x, but I guess it never happened (https://issues.apache.org/jira/browse/SOLR-6103; I can try porting this patch myself, but I would like to know what it takes and hear opinions).

So basically, I need to maintain the relationship between the start and end dates for each of the `state`s (absence or presence).
So I thought I would need to index the fields as pairs, as mentioned here: http://grokbase.com/t/lucene/solr-user/128r96vwz6/how-do-i-represent-a-group-of-customer-key-value-pairs

I guess my schema would look like:

    <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
    <field name="state" type="string" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="presenceStartTime_*" type="tdate" indexed="true" stored="true"/>
    <dynamicField name="presenceEndTime_*" type="tdate" indexed="true" stored="true"/>
    <dynamicField name="absenceStartTime_*" type="tdate" indexed="true" stored="true"/>
    <dynamicField name="absenceEndTime_*" type="tdate" indexed="true" stored="true"/>

**Question #1:** Does this look correct?

**Question #2:** What are the ramifications if I use `tlong` instead of `tdate`? My `tlong` type looks like this:

    <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>

**Question #3:** In this case, for the query "get all the students who were absent between a date range", would the query look something similar to this?

    (state: absent)
    AND (absenceStartTime1: givenLowerBoundDate)
    AND (absenceStartTime2: givenLowerBoundDate)
    AND (absenceStartTime3: givenLowerBoundDate)
    AND (absenceEndTime1: givenUpperBoundDate)
    AND (absenceEndTime2: givenUpperBoundDate)
    AND (absenceEndTime3: givenUpperBoundDate)

This would work only if I knew beforehand that there were 3 date ranges in which the student was absent, and there's no way to query all dynamic fields with wildcards according to http://stackoverflow.com/questions/6213184/solr-search-query-for-dynamic-fields-indexed

**Question #4:** The workaround mentioned in one of the answers to that question did not look terrible but seemed a bit complicated. Is there a better alternative for solving this problem in Solr?

Of course, I would be highly interested in any other better approaches.
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-How-to-index-range-pair-fields-tp4224369.html Sent from the Solr - User mailing list archive at Nabble.com.