Re: R: info on queries and index

Michael Marth Sat, 27 Feb 2016 14:00:17 -0800

Hi,

the simplest approach is to just use the built-in Lucene. That pretty much 
rules out the problems you mention (external server overloaded or not 
reachable).
Losing an index is a problem in any architecture. Re-indexing would happen 
faster with the built-in Lucene, as no content has to be transported over a 
network.
Solr is a useful option if you intend to leverage Solr-specific features that 
do not exist in Lucene.


HTH
Michael




On 26/02/16 02:17, "Ancona Francesco" <[email protected]> wrote:

>Hello,
>so if Lucene or Solr has a problem or is busy for some reason, we can't 
>search anything, if I understand correctly.
>
>So, I imagine, we have to be very careful with the search engine, which is a 
>potential single point of failure if it goes down, or if it loses its index 
>and has to do a full reindex.
>
>What kind of topology (application and search engine) do you suggest to 
>mitigate this problem?
>
>Thanks in advance,
>best regards
>
>-----Original Message-----
>From: Davide Giannella [mailto:[email protected]] 
>Sent: Friday, 26 February 2016 10:17
>To: [email protected]
>Subject: Re: info on queries and index
>
>On 25/02/2016 16:40, Ancona Francesco wrote:
>>
>> Hello,
>>
>> we'd like to study queries and indexes in depth. In particular, it is not 
>> clear from the documentation what is indexed by default.
>>
>> For instance, if I create a new type of document (IdentityCard with 
>> name IDC) with 2 new properties (idCard and idGeneralAnagrafic), are 
>> these data (i.e. metadata) indexed?
>>
>Short answer: Oak does not index anything by default.
>
>Long one: it depends on how you construct the repository. For example, if you 
>build a JCR repository by providing the InitialContent RepositoryInitializer 
>(0), you'll see that it creates some index definitions (1): uuid, nodetype and 
>counter.
>
>(0) https://goo.gl/MNpam7
>(1) https://goo.gl/G6RChL
>>
>>  
>>
>> And in that case, is the search delegated to the db (Mongo or Postgres, 
>> which store the metadata) or to Solr or Lucene?
>>
>As it is now, Oak does not delegate any of the searches to the persistence 
>layer. We don't have plans to do so as far as I know. In Oak we have mainly 2 
>types of indexes: PropertyIndex and LuceneIndex. You can find more details 
>starting from (3).
>
>(3) http://goo.gl/vfMJm3
>
>>  Finally, if we store a few million documents, what kind of strategy 
>> would you suggest for the search?
>>
>>  
>>
>The main principle around searches is that the smaller the index, the faster 
>the query. So fine-tuning the indexes is the main strategy for fast queries. 
>Which tuning strategy makes sense depends on the index you use. As a rule of 
>thumb I'd say that it doesn't matter which index you use as long as you keep 
>the content in a decent structure. For example, LucenePropertyIndex can 
>evaluate multiple conditions and path restrictions as well.
>
>When defining an index you can specify which paths it should cover, thereby 
>making the index as precise as possible. It's a tradeoff you'll have to 
>find yourself, as with all performance tuning. Again, I'd start with (3).
>
>HTH
>Davide
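
To make Davide's point about small, path-scoped indexes more concrete, here 
is a rough sketch of what a Lucene property index definition for the 
IdentityCard case might look like. It does not call the Oak API: it only 
builds a nested map that mirrors the node structure one would create under 
/oak:index and prints it. The node type name "IDC" and the two property names 
come from the question above; the index name "identityCard", the path 
"/content/identityCards" and the class and method names are made up for 
illustration. The property keys (type, async, compatVersion, includedPaths, 
evaluatePathRestrictions, indexRules) are the ones described in the Oak 
Lucene index documentation, but please check them against the Oak version 
you run.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models an Oak index definition as nested maps,
// without depending on Oak itself.
public class IndexDefinitionSketch {

    // Builds the structure of a Lucene property index that covers only the
    // given node type, paths and properties (all caller-supplied assumptions).
    static Map<String, Object> luceneIndexFor(String nodeType,
                                              List<String> includedPaths,
                                              List<String> propertyNames) {
        Map<String, Object> properties = new LinkedHashMap<>();
        for (String name : propertyNames) {
            Map<String, Object> prop = new LinkedHashMap<>();
            prop.put("name", name);
            prop.put("propertyIndex", true);
            properties.put(name, prop);
        }

        Map<String, Object> rule = new LinkedHashMap<>();
        rule.put("properties", properties);

        Map<String, Object> indexRules = new LinkedHashMap<>();
        indexRules.put(nodeType, rule);

        Map<String, Object> definition = new LinkedHashMap<>();
        definition.put("jcr:primaryType", "oak:QueryIndexDefinition");
        definition.put("type", "lucene");
        definition.put("async", "async");
        definition.put("compatVersion", 2);
        // Lets the index evaluate path restrictions itself.
        definition.put("evaluatePathRestrictions", true);
        // Keeps the index small by indexing only these subtrees.
        definition.put("includedPaths", includedPaths);
        definition.put("indexRules", indexRules);
        return definition;
    }

    // Renders the nested map as an indented tree: "+" for nodes, "-" for properties.
    static void render(String name, Map<String, Object> node,
                       String indent, StringBuilder out) {
        out.append(indent).append("+ ").append(name).append('\n');
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            if (entry.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) entry.getValue();
                render(entry.getKey(), child, indent + "  ", out);
            } else {
                out.append(indent).append("  - ").append(entry.getKey())
                   .append(" = ").append(entry.getValue()).append('\n');
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> definition = luceneIndexFor(
                "IDC",                                      // node type from the question
                Arrays.asList("/content/identityCards"),    // hypothetical path
                Arrays.asList("idCard", "idGeneralAnagrafic"));
        StringBuilder out = new StringBuilder();
        render("identityCard", definition, "", out);        // hypothetical index name
        System.out.print(out);
    }
}
```

In a real repository the same structure would be created as nodes and 
properties below /oak:index (for example through the JCR API), and the index 
would then need a reindex before queries can use it.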
>
