Re: [orientdb] Some thought regarding the usability of fulltext search in a graph vs. document environment

stefan Mon, 26 May 2014 09:35:29 -0700

let's make that Riccardo ;)

On Monday, 26 May 2014 09:05:40 UTC, [email protected] wrote:
>
> Hi Recardo,
>
> Thank you for taking the time :).
>
> If the information could be assembled into a single document for indexing 
> only, then the index would return a pointer to the master vertex, but it 
> could return the virtual document as well. 
> If the temporary document is stored on the indexing site then the index 
> grows bigger but the advantage is that you have a "preview" document 
> available for search results.
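
The trade-off described above (pointer only vs. a stored "preview" document) can be sketched with a toy inverted index. This is only an illustration: `ToyIndex`, its `store_preview` flag and the record id `#12:0` are hypothetical names made up for the example, not part of OrientDB or Lucene.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical inverted index; not how Lucene or OrientDB store data."""

    def __init__(self, store_preview):
        self.store_preview = store_preview
        self.postings = defaultdict(set)  # term -> ids of master vertices
        self.previews = {}                # master vertex id -> stored text

    def add(self, master_rid, text):
        # Every term points back to the master vertex.
        for term in text.lower().split():
            self.postings[term].add(master_rid)
        # Optionally keep the assembled text, which makes the index bigger.
        if self.store_preview:
            self.previews[master_rid] = text

    def search(self, term):
        hits = sorted(self.postings.get(term.lower(), ()))
        # Each hit is (pointer, preview); preview is None when not stored.
        return [(rid, self.previews.get(rid)) for rid in hits]


text = "John Pine Street 980201"
pointer_only = ToyIndex(store_preview=False)
with_preview = ToyIndex(store_preview=True)
pointer_only.add("#12:0", text)
with_preview.add("#12:0", text)
```

With `store_preview=False` a hit carries only the pointer, so the preview has to be fetched from the graph afterwards; with `store_preview=True` the result can be shown directly, at the cost of index size.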
>
> We used the latter approach but the project we are working on now is too 
> large for that to be feasible.
>
> If the information is combined in this way the novice users (regular 
> employees) can compose a Lucene query which has a syntax close enough to 
> advanced search in google to be "acceptable" to them.
>
> It's far fetched to write a query parser, for simple queries, that allows 
> search over fulltext+graph. The only alternative that has come to mind is 
> to create this "searchable document" that contains information from various 
> vertices.
>
> I'm an OrientDB fan. One of the biggest reasons for that is this hybrid 
> model of documents with graph based relations. The fulltext capabilities of 
> competing document-only stores (Elastic Search/Solr) offer so much more 
> usability than this can offer without additional functionality like this.
>
> I'm not sure if this is the right approach. The only thing I'm pretty sure 
> of is that something like this is needed to complete the merger of the two 
> domains (documents+graphs).
>
> Regards,
>   -Stefán
>
>
> On Monday, 26 May 2014 06:57:00 UTC, Riccardo Tasso wrote:
>>
>> Very interesting post.
>>
>> In my application, based on OrientDB and FULLTEXT indices (not Lucene for 
>> now), I recently had a similar experience.
>>
>> I indexed the field "title" of a document, but the user expected to find 
>> a document also when a given field of a connected vertex matched the text 
>> query. I don't remember if it's possible, but I'd like to achieve this 
>> result with a query of the form:
>>
>> SELECT FROM Person WHERE name containsText "Garibaldi" OR 
>> out('address').street containsText "Garibaldi"
>>
>> When you write a query, the result represents a "virtual document", so I 
>> don't think any extra functionality will be required.
>>
>> Do you agree?
>> Cheers,
>>    Riccardo
>>
>>
>> 2014-05-24 13:53 GMT+02:00 <[email protected]>:
>>
>>> Hi,
>>>
>>> I'm one of the OrientDB users that celebrates that Lucene can now be 
>>> used for fulltext indexing in OrientDB, thank you!
>>>
>>> I have been using Solr for some years and used it, for example, to build 
>>> an extensive "Enterprise search" where it shined (powered by Lucene).
>>>
>>> One of the things we found was that the MultiFieldQueryParser was quite 
>>> helpful as it provided users/employees a fairly simple and powerful way to 
>>> narrow their search for entities/documents.
>>>
>>> The usability of fulltext search diminishes somewhat when its creation 
>>> relies on information stored in a graph, as related information, like 
>>> address+postcodes, has been spread out/normalized over many vertices and 
>>> edges.
>>> A search for "John that lives on Pine* in 980201" can no longer be built 
>>> using the index or the MultiFieldQueryParser, at least not without 
>>> combining OSQL with the Lucene query.
>>>
>>> What I guess I'm trying to say is that the fulltext-document-search that 
>>> shines when it's based on documents (take Solr/Elastic Search for example) 
>>> is rendered quite limited when used on top of a graph if it can only consist 
>>> of information from a single vertex.
>>>
>>> The ability to create a virtual/temporary document for fulltext 
>>> indexing, from the information of a vertex and the adjacent vertices, is 
>>> quite appealing to me, and it would bridge the gap between document and 
>>> graph strengths and weaknesses when it comes to fulltext search.
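
As a rough illustration of the virtual/temporary document idea, the sketch below flattens a vertex and its directly adjacent vertices into one dictionary that could then be handed to a fulltext indexer. Everything here is an assumption made for the example: the `Vertex` class is an in-memory stand-in rather than an OrientDB API, and the dotted field names (`address.street`) and record ids are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """Hypothetical in-memory stand-in for a graph vertex."""
    rid: str
    fields: dict
    out_edges: dict = field(default_factory=dict)  # edge label -> [Vertex]


def build_virtual_document(vertex):
    """Flatten a vertex and its adjacent vertices into one dict.

    Neighbour fields are prefixed with the edge label so that a
    multi-field query could still target them individually.
    """
    doc = {"rid": vertex.rid}  # pointer back to the master vertex
    doc.update(vertex.fields)
    for label, neighbours in vertex.out_edges.items():
        for neighbour in neighbours:
            for name, value in neighbour.fields.items():
                # A vertex may have several neighbours per edge label,
                # so neighbour values are collected in lists.
                doc.setdefault(f"{label}.{name}", []).append(value)
    return doc


address = Vertex("#13:0", {"street": "Pine Street", "postcode": "980201"})
john = Vertex("#12:0", {"name": "John"}, {"address": [address]})
virtual = build_virtual_document(john)
```

Only one hop is followed here; how deep to traverse, and which edges to include, would be a per-class decision in a real setup.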
>>>
>>> I realize that there is a line between what the database itself should 
>>> do and what the users need to do by themselves, but keeping in mind that 
>>> OrientDB's main differentiation is its mix of a document store and graph, I 
>>> think that a more powerful fulltext feature, that takes these differences 
>>> into account, could help establish it as a clear winner in both spaces.
>>>
>>> There are many small projects, like Django-Haystack, that focus on the 
>>> ability to create virtual/temporary documents for indexing-only purposes 
>>> that might be helpful in evaluating options to improve this.
>>>
>>> Please let me know if anyone else here shares this view or, better yet, 
>>> has devised a simple way around this limitation.
>>>
>>> Very best regards,
>>>   -Stefán
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>  -- 
>>>
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>


