https://lucidworks.com/2016/02/13/solrs-daterangefield-perform/

On Thu, Mar 1, 2018 at 1:22 PM, Andy Seaborne <[email protected]> wrote:
>
>
> On 28/02/18 17:53, Marco Neumann wrote:
>>
>> thank you, it's less than I hoped for
>
>
> Concrete example?
>
>
>
>> but certainly more than what I
>> can ask for Andy :)
>>
>> In short I'd like to get the xsd:dateTime scan out of the sparql
>> filter and perform a more efficient range via a date index similar to
>> the jena spatial implementation.
>>
>> I am going to take a look at DateRangeField and see how it performs
>> relative to a standard SPARQL filter range query.
>>
>> best,
>> Marco
>>
>>
>> On Tue, Feb 27, 2018 at 5:21 PM, Andy Seaborne <[email protected]> wrote:
>>>
>>>
>>> On 27/02/18 11:41, Marco Neumann wrote:
>>>>
>>>>
>>>> Hi Andy, (I presume you wrote the following below) could you please
>>>> elaborate on the significance of this contribution in TDB?
>>>
>>>
>>>
>>> Hi Marco,
>>>
>>> For certain XSD datatypes, the value is stored in the NodeId itself (64
>>> bits, minus the datatype indicator: 56 bits for TDB1, up to 62 bits for
>>> TDB2 for xsd:doubles). It is faster to get the node back out of the
>>> database.
>>>
>>> If the value does not fit in the bits available, the long form is used.
>>> In the long form, the NodeId is a pointer into the node table and the
>>> node is stored as the lexical form+datatype (TDB1: in text; TDB2: in
>>> binary / RDF Thrift). This applies to strings and URIs.
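
A minimal sketch of the inline idea described above, in Python. The bit layout (top 8 bits as a datatype tag, low 56 bits as the value) and the tag value are assumptions made for illustration, not TDB's actual encoding, and the function names are hypothetical.

```python
# Hypothetical layout, for illustration only: top 8 bits = datatype tag,
# low 56 bits = value. This is not TDB's actual bit layout.
VALUE_BITS = 56
VALUE_MASK = (1 << VALUE_BITS) - 1
TAG_INTEGER = 0x01  # made-up tag value


def try_inline_integer(value):
    """Return a 64-bit id holding the value inline, or None if it does not fit."""
    limit = 1 << (VALUE_BITS - 1)
    if not -limit <= value < limit:
        # Caller would fall back to the long form (pointer into the node table).
        return None
    return (TAG_INTEGER << VALUE_BITS) | (value & VALUE_MASK)


def decode_inline_integer(node_id):
    """Recover the integer from an inline id without any table lookup."""
    assert node_id >> VALUE_BITS == TAG_INTEGER
    raw = node_id & VALUE_MASK
    # Sign-extend from 56 bits.
    if raw & (1 << (VALUE_BITS - 1)):
        raw -= 1 << VALUE_BITS
    return raw
```

Decoding needs only shifts and masks on the id itself, which is why the inline form avoids a trip to the node table.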
>>>
>>>>
>>>> "The xsd:dateTime and xsd:date ranges cover about 8000 years from year
>>>> zero with a precision down to 1 millisecond. Timezone information is
>>>> retained to an accuracy of 15 minutes with special timezones for Z and
>>>> for no explicit timezone."
>>>
>>>
>>>
>>> That's the limit for xsd:dateTime in 56 bits.
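
As a rough check of why 56 bits gives those limits, here is a small Python sketch that adds up a field-by-field bit budget. The split into fields is an assumption chosen for illustration and may not match TDB's real layout; the timezone count assumes 15-minute steps across the XSD range -14:00 to +14:00 plus the two special cases from the quoted documentation (Z and no timezone).

```python
def bits_needed(distinct_values):
    """Smallest number of bits that can represent this many distinct values."""
    return (distinct_values - 1).bit_length()


# Assumed field split (illustrative, not necessarily TDB's actual layout).
timezone_values = 28 * 4 + 1 + 2  # 15-minute steps over -14:00..+14:00, plus Z and "none"

fields = {
    "year": 8000,                 # about 8000 years from year zero
    "month": 12,
    "day": 31,
    "hour": 24,
    "minute": 60,
    "millis_of_minute": 60_000,   # seconds and milliseconds together
    "timezone": timezone_values,
}

budget = {name: bits_needed(count) for name, count in fields.items()}
total_bits = sum(budget.values())
```

Under these assumptions `total_bits` comes to 56, so one more bit of year range or finer timezone resolution would no longer fit.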
>>>
>>>>
>>>>
>>>> https://jena.apache.org/documentation/tdb/architecture.html#inline-values
>>>>
>>>> does this give us enhanced temporal access methods via TDB that are
>>>> exposed as property functions in SPARQL?
>>>
>>>
>>>
>>> What exactly are you looking for here? Range queries or a database you
>>> can view at a point in time? ("Temporal database" can mean either.)
>>>
>>> You get the same SPARQL filter capabilities but the inline form is faster
>>> (measurable and by quite a lot) because it does not go to the node table.
>>> Despite caching of the node table, it is still faster to get nodes out of
>>> the DB from the inline form (and I'd like to go faster still).
>>>
>>> Point-in-time database:
>>>
>>> Not possible in TDB1.
>>> Possible (but not exposed) in TDB2.  TDB2 never forgets!
>>>
>>>> In particular I'd be interested in range queries on xsd:dateTime here
>>>> and the possible use of DateRangeField (SOLR) alongside jena-spatial.
>>>
>>>
>>>
>>> Range queries - it would be possible to start in the right place for a
>>> range scan because the values are in sorted order under this design.
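
To illustrate the difference between starting "in the right place" and filtering every value, here is a minimal Python sketch. It uses a plain sorted list of integer timestamps as a stand-in for an index; the function names are hypothetical and nothing here reflects how TDB or a SPARQL filter is actually implemented.

```python
from bisect import bisect_left, bisect_right


def range_scan(sorted_keys, low, high):
    """Jump to the first key >= low by binary search, then read until key > high."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return sorted_keys[start:end]


def filter_scan(keys, low, high):
    """Filter-style evaluation: test every key against the range."""
    return [k for k in keys if low <= k <= high]
```

Both return the same keys for sorted input, but `range_scan` only touches the matching slice plus a logarithmic number of probes, while `filter_scan` visits every key.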
>>>
>>> Insert complexity for the different datatypes possible - it might need a
>>> "this is a value centric database" flag so e.g. integers, whether
>>> xsd:short or xsd:???, are stored as binary integers, losing the datatype.
>>>
>>> In TDB1, that's true; TDB2 does keep the original datatype. Both are
>>> valid choices for different use cases.
>>>
>>> Hope that answers your questions,
>>>
>>>      Andy
>>>
>>>>
>>>>
>>>> Best,
>>>> Marco
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 


---
Marco Neumann
KONA
