maxFieldLength can go as high as roughly 2B tokens, so raising it
should have fixed your problem, at least as far as Solr is concerned.
How many tokens actually get indexed? I'm not as familiar with Tika,
but there may be some kind of limit parameter there too (although I
don't remember this coming up before)...

Did you restart Solr after making the change to solrconfig.xml?

If you're seeing 10,000 tokens or so, that's the default for
maxFieldLength....
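
For reference, in a stock 3.x solrconfig.xml the setting looks roughly
like the sketch below. This assumes the layout of the example config
that ships with Solr; the value shown is just Integer.MAX_VALUE, i.e.
the 2B ceiling, and you can pick something smaller. If your <mainIndex>
section also sets maxFieldLength, that value takes precedence over the
one in <indexDefaults>, so check both places.

```xml
<!-- solrconfig.xml (sketch, assuming the stock 3.x example layout) -->
<indexDefaults>
  <!-- other settings unchanged -->
  <!-- default is 10000; tokens past this count in a field are silently dropped -->
  <maxFieldLength>2147483647</maxFieldLength>
</indexDefaults>
```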

I'd recommend stopping Solr, "rm -rf <solr home>/data/index"
and restarting Solr just to be sure you're not seeing leftover
junk. Either way, you'll have to re-index your docs after changing
the maxFieldLength param, since already-indexed docs stay truncated.


Best
Erick


On Mon, Apr 2, 2012 at 7:19 AM, Manoj Saini <manoj.sa...@stigasoft.com> wrote:
> Hello Guys,
>
> I am using Apache Solr 3.3.0 with Tika 1.0.
>
> I have PDF files which I am pushing into Solr for content searching. Apache
> Solr is indexing the PDF files and I can see them in the Solr admin interface
> for search. But the issue is that Solr is not indexing the whole file content.
> It indexes only up to a limited size.
>
> Am I missing something, some configuration, or is this the behavior of
> Apache Solr?
>
> I have tried to update solrconfig.xml. I have updated ramBufferSizeMB and
> maxFieldLength.
>
> Thanks
> Manoj Saini
