Re: index new discovered fileds of different types

2017-07-10 Thread Jan Høydahl
I think Thaer’s answer clarifies how they do it. So at the time they assemble the full Solr doc to index, there may be a new field name not known in advance, but to my understanding the RDF source contains information on the type (else they could not do the mapping to dynamic field either) and so

Re: index new discovered fileds of different types

2017-07-10 Thread Thaer Sammar
Hi Rick, yes, the RDF structure has subject, predicate and object. The object data type is not only text; it can be integer or double as well, or other data types. The structure of our Solr document doesn't contain only these three fields. We compose one document per subject and we use all found
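The one-document-per-subject approach described here could be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual code: the triples are given as plain Python tuples, and the function name `docs_by_subject` and the use of the subject as the `id` field are hypothetical.

```python
from collections import defaultdict


def docs_by_subject(triples):
    """Group (subject, predicate, object) triples into one dict per subject."""
    docs = defaultdict(dict)
    for subject, predicate, obj in triples:
        doc = docs[subject]
        # Assumption: the subject URI doubles as the document id.
        doc["id"] = subject
        # A predicate may occur several times, so collect values in a list.
        doc.setdefault(predicate, []).append(obj)
    return list(docs.values())


# Hypothetical sample data; object values keep their native Python types.
triples = [
    ("ex:alice", "name", "Alice"),
    ("ex:alice", "age", 42),
    ("ex:bob", "name", "Bob"),
]
docs = docs_by_subject(triples)
```

Each resulting dict carries whatever predicates were found for that subject, which is why the set of field names is not known in advance.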

Re: index new discovered fileds of different types

2017-07-09 Thread Rick Leir
Jan, I hope this is not off-topic, but I am curious: if you do not use the three fields (subject, predicate, and object) for indexing RDF, then what is your algorithm? Maybe document nesting is appropriate for this? cheers -- Rick On 2017-07-09 05:52 PM, Jan Høydahl wrote: Hi, I have

Re: index new discovered fileds of different types

2017-07-09 Thread Jan Høydahl
Hi, I have personally written a Python script to parse RDF files into an in-memory graph structure and then pull data from that structure to index to Solr. I.e. you may perfectly well have RDF (nt, turtle, whatever) as source but index substructures in very specific ways. Anyway, as Erick

Re: index new discovered fileds of different types

2017-07-07 Thread Rick Leir
Thaer, whoa, hold everything! You said RDF, meaning Resource Description Framework? If so, you have exactly three fields: subject, predicate, and object. Maybe they are text type, or for exact matches you might want string fields. Add an ID field, which could be automatically generated by Solr,

Re: index new discovered fileds of different types

2017-07-07 Thread Erick Erickson
I'd recommend "managed schema" rather than schemaless. They're related but distinct. The problem is that schemaless makes assumptions based on the first field it finds. So if it finds a field with a "1" in it, it guesses "int". That'll break if the next doc has a 1.0 since it doesn't parse to an
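The failure mode described here can be illustrated with a toy model of first-value type guessing. This is only a sketch and not Solr's actual implementation; `guess_type` and `GuessingSchema` are hypothetical names invented for the example.

```python
def guess_type(value):
    """Guess a field type from a raw string value (illustrative only)."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "double"
    except ValueError:
        return "string"


class GuessingSchema:
    """Toy schema that fixes a field's type from the first value it sees."""

    def __init__(self):
        self.fields = {}

    def index(self, doc):
        for name, value in doc.items():
            # The first document to mention a field decides its type.
            if name not in self.fields:
                self.fields[name] = guess_type(value)
            # Later values must parse as the type that was guessed earlier.
            if self.fields[name] == "int":
                int(value)  # raises ValueError for a value such as "1.0"
            elif self.fields[name] == "double":
                float(value)


schema = GuessingSchema()
schema.index({"price": "1"})  # "price" is now locked in as "int"
try:
    schema.index({"price": "1.0"})
    second_doc_ok = True
except ValueError:
    second_doc_ok = False  # the second document is rejected
```

Defining the field type explicitly before indexing avoids depending on which document happens to arrive first.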

Re: index new discovered fileds of different types

2017-07-07 Thread Thaer Sammar
Hi Jan, thanks! I am exploring the schemaless option based on Furkan's suggestion. I need the flexibility because not all fields are known. We get the data from an RDF database (which changes continuously). To be more specific, we have a database and all changes to it are sent to a Kafka queue.

Re: index new discovered fileds of different types

2017-07-07 Thread Jan Høydahl
If you do not need the flexibility of dynamic fields, don’t use them. Sounds to me that you really want a field “price” to be float and a field “birthdate” to be of type date etc. If so, simply create your schema (either manually, through Schema API or using schemaless) up front and index each
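As a rough sketch of defining such fields up front, the snippet below builds `add-field` commands of the kind the Schema API accepts as JSON. It only constructs the payloads and does not send them. The helper name `add_field_command` is hypothetical, and the type names `pfloat` and `pdate` are assumptions that must match `fieldType` definitions in your own configset (other configsets may use different names).

```python
import json


def add_field_command(name, field_type, stored=True):
    """Build one Schema API 'add-field' command as a plain dict."""
    return {"add-field": {"name": name, "type": field_type, "stored": stored}}


# Assumed type names; check the fieldType names in your own schema.
commands = [
    add_field_command("price", "pfloat"),
    add_field_command("birthdate", "pdate"),
]

for command in commands:
    # Each JSON body would be POSTed to the collection's schema endpoint.
    print(json.dumps(command))
```

With the fields declared this way, every document can use the plain names `price` and `birthdate` without any type prefix.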

Re: index new discovered fileds of different types

2017-07-05 Thread Erick Erickson
I really have no idea what "to ignore the prefix and check of the type" means. When? How? Can you give an example of inputs and outputs? You might want to review: https://wiki.apache.org/solr/UsingMailingLists And to add to what Furkan mentioned, in addition to schemaless you can use "managed

Re: index new discovered fileds of different types

2017-07-05 Thread Thaer Sammar
Hi Furkan, No. In the schema we also defined some static fields such as uri and geo field. On 5 July 2017 at 17:07, Furkan KAMACI wrote: > Hi Thaer, > > Do you use schemaless mode [1] ? > > Kind Regards, > Furkan KAMACI > > [1]

Re: index new discovered fileds of different types

2017-07-05 Thread Furkan KAMACI
Hi Thaer, Do you use schemaless mode [1] ? Kind Regards, Furkan KAMACI [1] https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode On Wed, Jul 5, 2017 at 4:23 PM, Thaer Sammar wrote: > Hi, > We are trying to index documents of different types. Documents have >

index new discovered fileds of different types

2017-07-05 Thread Thaer Sammar
Hi, We are trying to index documents of different types. Documents have different fields. Fields are known at indexing time. We run a query on a database and we index what comes back, using query variables as field names in Solr. Our current solution: we use dynamic fields with prefix, for example
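A minimal sketch of the prefix approach, assuming the prefix is chosen from the value's type. The prefixes `int_`, `double_`, `bool_` and `str_` and the function names are hypothetical, since the message does not show the actual ones; a schema would need matching dynamic field patterns such as `int_*`.

```python
def dynamic_field_name(name, value):
    """Return a prefixed field name chosen from the Python type of value."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        prefix = "bool_"
    elif isinstance(value, int):
        prefix = "int_"
    elif isinstance(value, float):
        prefix = "double_"
    else:
        prefix = "str_"
    return prefix + name


def to_solr_doc(row):
    """Rename every query variable in a result row to its prefixed form."""
    return {dynamic_field_name(key, value): value for key, value in row.items()}


# Hypothetical result row keyed by query variable names.
row = {"price": 9.5, "count": 3, "title": "example"}
doc = to_solr_doc(row)
```

The trade-off is that the type leaks into the field name, so queries have to use `double_price` rather than `price`.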