Can you get Delivery Server to generate a Solr-style XML or JSON update
file? That might be easier than generating HTML and then re-parsing it.
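
If Delivery Server cannot emit update files directly, a small preprocessing
step over the generated HTML could do the same job. Below is a minimal
sketch in Python, assuming the country markers look exactly like the ones in
the example quoted below. The field names (id, page_id, country, content)
and the helper country_docs are hypothetical: they would have to match
whatever your schema defines, with country as a multi-valued field.

```python
import json
import re
from html import unescape

# One "<!-- Country: FR, BE -->" ... "<!-- Country End -->" block.
# The closing "-->" tolerates stray extra dashes such as "--->".
SECTION_RE = re.compile(
    r"<!--\s*Country:\s*(?P<codes>[^>]*?)\s*-+>"
    r"(?P<body>.*?)"
    r"<!--\s*Country End\s*-+>",
    re.DOTALL | re.IGNORECASE,
)
TAG_RE = re.compile(r"<[^>]+>")


def country_docs(page_id, html):
    """Return one dict per country section, ready to serialize as JSON."""
    docs = []
    for n, match in enumerate(SECTION_RE.finditer(html)):
        codes = [c.strip().upper()
                 for c in match.group("codes").split(",") if c.strip()]
        # Crude tag stripping; a real HTML parser would be more robust.
        text = unescape(TAG_RE.sub(" ", match.group("body")))
        docs.append({
            "id": f"{page_id}#{n}",       # hypothetical unique key
            "page_id": page_id,           # hypothetical field
            "country": codes,             # hypothetical multi-valued field
            "content": " ".join(text.split()),
        })
    return docs


if __name__ == "__main__":
    sample = (
        "<!-- Country: FR, BE -->\n<p>Address for France and Benelux</p>\n"
        "<!-- Country End -->\n"
        "<!-- Country: CH -->\n<p>Address for Switzerland</p>\n"
        "<!-- Country End -->\n"
    )
    # A JSON array of documents that could be posted to Solr's update handler.
    print(json.dumps(country_docs("page1", sample), indent=2))
```

With each section indexed as its own document, the search layer could
restrict results to the current country with a filter query such as
fq=country:FR. Text outside the country markers is ignored by this sketch
and would need its own document if it should stay searchable.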

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Thu, Mar 27, 2014 at 3:28 PM, Michael Clivot <cli...@netmedia.de> wrote:
> Thanks for your answer, Jack.
> @Gora:
>
>> How are you fetching the HTML content, and indexing it into Solr?
>
> We are using Solr with the OpenText Delivery Server. The Delivery Server 
> generates HTML representations of the published pages and writes them to a 
> directory, which Solr reads to get the content.
>
>> It is probably best to handle this requirement at that point. Haven't used 
>> Nutch ( http://nutch.apache.org/) recently, but you might be able to use it 
>> for this.
>
> Do you mean the web-crawler approach? At first glance, it does not fit us very 
> well. In that case we would have to implement the OpenText search layer ourselves. 
> Theoretically, we could try to teach Delivery Server to understand external 
> indexes. But crawling itself is not our preferred solution: it is not as 
> responsive as the Delivery Server approach, and where authorization 
> restrictions exist, it would need a separate crawler user for every role, etc.
>
> -----Original Message-----
> From: Gora Mohanty [mailto:g...@mimirtech.com]
> Sent: Tuesday, 25 March 2014 11:32
> To: solr-user@lucene.apache.org
> Subject: Re: Indexing parts of an HTML file differently
>
> On 25 March 2014 15:59, Michael Clivot <cli...@netmedia.de> wrote:
>> Hello,
>>
>> I have the following issue and need help:
>>
>> One HTML file has different parts for different countries.
>> For example:
>>
>> <!-- Country: FR, BE -->
>> ....
>> Address for France and Benelux
>> ....
>> <!-- Country End -->
>> <!-- Country: CH -->
>> ....
>> Address for Switzerland
>> ....
>> <!-- Country End -->
>>
>> Depending on a parameter, I show or hide the parts on the website.
>> Naturally, all parts are in the index, and therefore all of them are found by 
>> Solr.
>> My question is: how can I get only the items for the current country in my 
>> result list?
>
> How are you fetching the HTML content, and indexing it into Solr?
> It is probably best to handle this requirement at that point. Haven't used 
> Nutch ( http://nutch.apache.org/ ) recently, but you might be able to use it 
> for this.
>
> Regards,
> Gora
