Thanks for your note, Anand.  What was the maximum chunk size for you?  Could 
you post the relevant portions of your configuration file?


Thanks!
Pete

On Oct 21, 2011, at 4:20 AM, anand.ni...@rbs.com wrote:

> Hi,
> 
> I was also having trouble highlighting large text files. I applied the 
> solution proposed here and it worked, but now I get the error shown below.
> 
> Basically, 'hitGrouped.vm' is not found. I am using solr-3.4.0. Where can I 
> get this file? It is referenced in browse.vm:
> 
> <div class="results">
>  #if($response.response.get('grouped'))
>    #foreach($grouping in $response.response.get('grouped'))
>      #parse("hitGrouped.vm")
>    #end
>  #else
>    #foreach($doc in $response.results)
>      #parse("hit.vm")
>    #end
>  #end
> </div>
> 
> 
> HTTP Status 500 - Can't find resource 'hitGrouped.vm' in classpath or 'C:\caprice\workspace\caprice\dist\DEV\solr\.\conf/', cwd=C:\glassfish3\glassfish\domains\domain1\config
> java.lang.RuntimeException: Can't find resource 'hitGrouped.vm' in classpath or 'C:\caprice\workspace\caprice\dist\DEV\solr\.\conf/', cwd=C:\glassfish3\glassfish\domains\domain1\config
>   at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:268)
>   at org.apache.solr.response.SolrVelocityResourceLoader.getResourceStream(SolrVelocityResourceLoader.java:42)
>   at org.apache.velocity.Template.process(Template.java:98)
>   at org.apache.velocity.runtime.resource.ResourceManagerImpl.loadResource(ResourceManagerImpl.java:446)
>   at 
> 
> Thanks & Regards,
> Anand
> Anand Nigam
> RBS Global Banking & Markets
> Office: +91 124 492 5506   
> 
> 
> -----Original Message-----
> From: karsten-s...@gmx.de [mailto:karsten-s...@gmx.de] 
> Sent: 21 October 2011 14:58
> To: solr-user@lucene.apache.org
> Subject: Re: Can Solr handle large text files?
> 
> Hi Peter,
> 
> Highlighting in large text files cannot be fast without dividing the 
> original text into small pieces.
> So take a look at
> http://xtf.cdlib.org/documentation/under-the-hood/#Chunking
> and at
> http://www.lucidimagination.com/blog/2010/09/16/2446/
> 
> This means that you should divide your files and use Result Grouping / Field 
> Collapsing to list only one hit per original document.
> 
> (XTF would also solve your problem "out of the box", but XTF does not use 
> Solr.)
> 
> Best regards
>  Karsten
> 
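A minimal sketch of the chunk-and-group approach described above, in Python. The field name `parent_id`, the `id` scheme, and both helper names are assumptions made for illustration; `group`, `group.field`, `group.limit`, `hl`, and `hl.fl` are standard Solr request parameters. Chunks break only on line boundaries, so a single line longer than `max_chars` still ends up as one oversized chunk.

```python
from urllib.parse import urlencode


def chunk_lines(doc_id, lines, max_chars=1_000_000):
    """Yield chunk documents of roughly max_chars characters each.

    Every chunk carries the id of its source file in the (hypothetical)
    field `parent_id`, so hits can later be grouped per original file.
    """
    buf, size, seq = [], 0, 0
    for line in lines:
        # Start a new chunk when adding this line would exceed the limit.
        if buf and size + len(line) > max_chars:
            yield {"id": f"{doc_id}#{seq}", "parent_id": doc_id, "body": "".join(buf)}
            buf, size, seq = [], 0, seq + 1
        buf.append(line)
        size += len(line)
    if buf:
        yield {"id": f"{doc_id}#{seq}", "parent_id": doc_id, "body": "".join(buf)}


def grouped_query(terms):
    """Build a query string that collapses chunk hits to one group per file."""
    return urlencode({
        "q": f"body:({terms})",
        "group": "true",
        "group.field": "parent_id",  # assumed single-valued, indexed string field
        "group.limit": 1,            # one representative chunk per file
        "hl": "true",
        "hl.fl": "body",
    })
```

With this layout a term that occurs in five chunks of the same file should come back as one group rather than five documents, provided `parent_id` is defined in the schema as an indexed, single-valued field.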
> -------- Original Message --------
>> Date: Thu, 20 Oct 2011 17:59:04 -0700
>> From: Peter Spam <ps...@mac.com>
>> To: solr-user@lucene.apache.org
>> Subject: Can Solr handle large text files?
> 
>> I have about 20k text files, some very small, but some up to 300MB, 
>> and would like to do text searching with highlighting.
>> 
>> Imagine the text is the contents of your syslog.
>> 
>> I would like to type in some terms, such as "error" and "mail", and 
>> have Solr return the syslog lines with those terms PLUS two lines of context.
>> Pretty much just like Google's highlighting.
>> 
>> 1) Can Solr handle this?  I had extremely long query times when I 
>> tried this with Solr 1.4.1 (yes, I was using TermVectors, etc.).  I 
>> tried breaking the files into 1MB pieces, but then searching returned 
>> the wrong number of documents (i.e., if one file had a term 5 times, 
>> and that was the only file that had the term, I want 1 result, not 5 
>> results).
>> 
>> 2) What sort of tokenizer would be best?  Here's what I'm using:
>> 
>>   <field name="body" type="text_pl" indexed="true" stored="true"
>>          multiValued="false" termVectors="true" termPositions="true"
>>          termOffsets="true" />
>> 
>>   <fieldType name="text_pl" class="solr.TextField">
>>     <analyzer>
>>       <tokenizer class="solr.StandardTokenizerFactory"/>
>>       <filter class="solr.LowerCaseFilterFactory"/>
>>       <filter class="solr.WordDelimiterFilterFactory"
>>               generateWordParts="0" generateNumberParts="0"
>>               catenateWords="0" catenateNumbers="0"
>>               catenateAll="0" splitOnCaseChange="0"/>
>>     </analyzer>
>>   </fieldType>
>> 
>> 
>> Thanks!
>> Pete
> 
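The "matching lines plus two lines of context" requirement from the original question can also be sketched outside Solr, as a post-processing step over a chunk that Solr has already returned. This is not the Solr highlighter, only an illustration in Python; the function name is hypothetical, and it assumes a line matches when it contains any of the terms, case-insensitively.

```python
def context_windows(lines, terms, context=2):
    """Return end-exclusive (start, end) line ranges around matching lines.

    A line matches if it contains any of the terms (case-insensitive).
    Overlapping or adjacent windows are merged into one range.
    """
    terms = [t.lower() for t in terms]
    windows = []
    for i, line in enumerate(lines):
        low = line.lower()
        if any(t in low for t in terms):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            if windows and start <= windows[-1][1]:
                # Extend the previous window instead of emitting a duplicate.
                windows[-1] = (windows[-1][0], end)
            else:
                windows.append((start, end))
    return windows
```

Each returned range can be sliced out of `lines` and shown as one snippet, which keeps nearby hits together instead of repeating the shared context lines.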
