Marius,

I would suggest generating the schema and config, and reloading them every time
a class changes.
Alternatively, and that's how solr-drupal works, you could define the fields by
prefix, but I am not sure the aliasing would work.

I believe that the query-expansion step (from title:x to title-en:x title-ft:x,
etc.) is best controlled early so that applications can change it. In curriki,
this is done with a custom query component, which uses the query parser (with a
default field that does not exist) and then rewrites the query objects (which is
fairly easy).
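
To make the rewrite step concrete, here is a minimal sketch in plain Python. It
is not the curriki component and does not use the Solr or Lucene APIs: the
Term and BoolQuery classes, the ALIASES table and the expand function are all
hypothetical stand-ins for parsed query objects.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Term:
    """A single field:text clause of a parsed query (hypothetical type)."""
    field: str
    text: str


@dataclass(frozen=True)
class BoolQuery:
    """A boolean combination of clauses (hypothetical type)."""
    op: str  # "AND" or "OR"
    clauses: tuple


Query = Union[Term, BoolQuery]

# Hypothetical alias table: logical field -> concrete indexed fields.
ALIASES: Dict[str, List[str]] = {"title": ["title-en", "title-ft"]}


def expand(query: Query, aliases: Dict[str, List[str]] = ALIASES) -> Query:
    """Recursively replace aliased terms with an OR over the concrete fields."""
    if isinstance(query, BoolQuery):
        return BoolQuery(query.op, tuple(expand(c, aliases) for c in query.clauses))
    targets = aliases.get(query.field)
    if not targets:
        return query  # not an alias: leave the term untouched
    return BoolQuery("OR", tuple(Term(f, query.text) for f in targets))


def to_string(query: Query) -> str:
    """Render the query tree back into a field:text string for inspection."""
    if isinstance(query, Term):
        return f"{query.field}:{query.text}"
    return "(" + f" {query.op} ".join(to_string(c) for c in query.clauses) + ")"


print(to_string(expand(Term("title", "x"))))
```

Because the alias table is ordinary data passed to expand, an application could
swap it per request, which is the kind of control I mean above.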

Hope it helps.




On 14 Nov 2013, at 17:28, Marius Dumitru Florea <[email protected]>
wrote:

> On Wed, Nov 13, 2013 at 8:08 PM, Ludovic Dubost <[email protected]> wrote:
>> Hi Marius,
>> 
>> I have a quick question as I start reading your proposal. I don't see
>> anything about multi-language indexing.
>> I remember that the current Solr implementation has multiple fields, one
>> for each language. Would there be a field for each indexed language
>> for each property?
> 
> Yes. Right now I'm struggling to find a way to define an alias for a
> group of dynamic fields. For document title we have this in
> solrconfig.xml
> 
> <str name="f.title.qf">title__ title_ar title_bg title_ca ...</str>
> 
> which makes 'title' an alias for all its translations and allows us to
> write title:text in the search query. I need to do the same, but
> dynamically, for each object property:
> 
> property_Blog.BlogPostClass_title =
> property_Blog.BlogPostClass_title__,
> property_Blog.BlogPostClass_title_en,
> property_Blog.BlogPostClass_title_fr, ...
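
As a rough illustration of the alias described above, the following Python
sketch builds the name and value of such a per-property alias parameter from a
list of locales. The alias_param helper is hypothetical, and whether Solr
accepts a dotted property name inside f.<field>.qf is an assumption that would
need to be checked.

```python
from typing import Iterable, Tuple


def alias_param(field: str, locales: Iterable[str]) -> Tuple[str, str]:
    """Return the (name, value) pair of an f.<field>.qf alias parameter.

    Hypothetical helper: it only builds strings and does not talk to Solr.
    """
    # "<field>__" is the locale-less variant, mirroring "title__" above.
    targets = [f"{field}__"] + [f"{field}_{locale}" for locale in locales]
    return f"f.{field}.qf", " ".join(targets)


name, value = alias_param("property_Blog.BlogPostClass_title", ["en", "fr"])
print(name)
print(value)
```

The same helper called with "title" and the configured locales reproduces the
shape of the f.title.qf entry quoted above.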
> 
> I'll keep you posted.
> 
> Thanks,
> Marius
> 
>> 
>> Ludovic
>> 
>> 
>> 2013/10/14 Marius Dumitru Florea <[email protected]>
>> 
>>> I started writing
>>> http://dev.xwiki.org/xwiki/bin/view/Design/SolrSchema . I need help
>>> with two things:
>>> 
>>> * test cases
>>> http://dev.xwiki.org/xwiki/bin/view/Design/SolrSchema#HTestCases
>>> * if time permits, review the proposal, especially
>>> http://dev.xwiki.org/xwiki/bin/view/Design/SolrSchema#HAMixedApproach
>>> .
>>> 
>>> Thanks,
>>> Marius
>>> 
>>> 
>>> On Fri, Oct 11, 2013 at 12:55 PM, Marius Dumitru Florea
>>> <[email protected]> wrote:
>>>> Hi devs,
>>>> 
>>>> This is a very important question so think carefully. Let me explain:
>>>> 
>>>> In XWiki (model) we have a few entity types. There are *wikis* which
>>>> have *spaces* which have *documents*. A document can have *objects*
>>>> and *attachments*. A document can also define a *class*.
>>>> 
>>>> At the same time we like to say that in XWiki "everything is a
>>>> document" because everything revolves around documents. The document
>>>> is the central notion.
>>>> 
>>>> We can query the database (using HQL or XWQL) for any of the
>>>> previously mentioned entities but what should a Solr query return
>>>> (semantically)? In other words:
>>>> 
>>>> * are you searching for an object without caring about the document
>>>> that holds the object? Same for an object property.
>>>> * how often are you searching for an attachment without caring about
>>>> the document that holds the attachment?
>>>> * are you searching for a class or for the document that defines that
>>>> class?
>>>> * are you searching for a wiki without caring about the documents it
>>>> contains? Same for a space.
>>>> 
>>>> IMO the result of a Solr query should be, semantically, a list of
>>>> documents. But maybe I'm wrong.
>>>> 
>>>> -----------------------
>>>> Technical Details
>>>> -----------------------
>>>> 
>>>> Unlike a relational database, a Solr/Lucene index has a single 'table'.
>>>> So normally you index a single entity type. Each row in the index
>>>> represents an entity of that type. As a consequence the result of a
>>>> Solr query is semantically a list of entities of that type. In our
>>>> case the entity type is (naturally) *document*.
>>>> 
>>>> If you want to index more entity types (e.g. index attachments and
>>>> objects _separately_, not as part of a document) then, since there is
>>>> only one 'table' in the index, you need to add a 'type' column that
>>>> specifies the type of entity you have on each row (e.g. type=document,
>>>> type=attachment, type=object etc.). The result of a Solr query is now,
>>>> semantically, a list of different entity types, unless you filter by a
>>>> specific type. It smells like a hack to me.
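
To picture the 'type' column, here is a small Python sketch that models the
index as a list of rows and filters on that column. The rows, the field names
and the search helper are made up for illustration; this is not Solr's data
model or API.

```python
# One 'table' holding several entity types, told apart by a "type" column.
# All rows and field names here are illustrative assumptions.
index = [
    {"type": "document", "name": "Blog.MyPost", "title": "My post"},
    {"type": "attachment", "name": "Blog.MyPost", "filename": "photo.png"},
    {"type": "object", "name": "Blog.MyPost", "className": "Blog.BlogPostClass"},
]


def search(rows, **criteria):
    """Return the rows whose columns equal every given criterion."""
    return [row for row in rows if all(row.get(k) == v for k, v in criteria.items())]


# Without a type filter the result mixes entity types.
print([row["type"] for row in search(index, name="Blog.MyPost")])

# With the filter the result is again a list of one entity type.
print([row["type"] for row in search(index, name="Blog.MyPost", type="document")])
```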
>>>> 
>>>> Let's imagine what happens if we want to search for blog posts that
>>>> have a specific tag. With the first approach this is easy because all
>>>> the (indexed) information is on a single row. With the second approach
>>>> this is considerably more complex because the information is spread over
>>>> multiple rows:
>>>> 
>>>> * one row with type=document for the blog post document
>>>> * one row with type=object for the blog post object
>>>> * one row with type=object for the tag object
>>>> 
>>>> In a relational database, when you have the information spread over
>>>> multiple places (tables) you do joins. Fortunately (you might say)
>>>> Solr supports joins. In this particular case we would have to perform
>>>> 2 joins which means:
>>>> 
>>>> index X index X index
>>>> 
>>>> where X represents the Cartesian product. The document name would be
>>>> the join key. Pretty complex even before trying to write this in Solr
>>>> query syntax.
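
The difference between the two approaches can be sketched in Python as follows.
This only models the description above (one row per document versus three rows
joined on the document name); it says nothing about how Solr actually executes
joins. The sample rows, the field names and the XWiki.TagClass class name are
assumptions made for the example.

```python
from itertools import product

# First approach: one row per document, objects folded into the row.
flat_index = [
    {"name": "Blog.MyPost",
     "objectClasses": ["Blog.BlogPostClass", "XWiki.TagClass"],
     "tags": ["solr"]},
    {"name": "Main.WebHome", "objectClasses": [], "tags": []},
]

# Second approach: every entity is its own row with a "type" column.
split_index = [
    {"type": "document", "name": "Blog.MyPost"},
    {"type": "object", "name": "Blog.MyPost", "className": "Blog.BlogPostClass"},
    {"type": "object", "name": "Blog.MyPost", "className": "XWiki.TagClass",
     "tags": ["solr"]},
    {"type": "document", "name": "Main.WebHome"},
]


def posts_with_tag_flat(rows, tag):
    """Single pass: everything needed is on the document row."""
    return [r["name"] for r in rows
            if "Blog.BlogPostClass" in r["objectClasses"] and tag in r["tags"]]


def posts_with_tag_joined(rows, tag):
    """index X index X index, joined on the document name."""
    names = []
    for doc, post, tagged in product(rows, repeat=3):
        if (doc["type"] == "document"
                and post["type"] == "object"
                and post.get("className") == "Blog.BlogPostClass"
                and tagged["type"] == "object"
                and tag in tagged.get("tags", [])
                and doc["name"] == post["name"] == tagged["name"]):
            names.append(doc["name"])
    return names


print(posts_with_tag_flat(flat_index, "solr"))
print(posts_with_tag_joined(split_index, "solr"))
```

Both functions return the same document, but the second one has to consider
every triple of rows before the join key narrows things down.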
>>>> 
>>>> So basically the question becomes: is it worth indexing more entities
>>>> _separately_ instead of indexing just documents (with info about their
>>>> objects and attachments) considering the complexity that it brings in
>>>> writing Solr queries? Do we search for objects and attachments alone
>>>> as separate entities often enough to justify this complexity? My
>>>> answer is no.
>>>> 
>>>> Thanks,
>>>> Marius
>>> _______________________________________________
>>> devs mailing list
>>> [email protected]
>>> http://lists.xwiki.org/mailman/listinfo/devs
>>> 
>> 
>> 
>> 
>> --
>> Ludovic Dubost
>> Founder and CEO
>> Blog: http://blog.ludovic.org/
>> XWiki: http://www.xwiki.com
>> Skype: ldubost GTalk: ldubost

