Solr/Tika config question

2019-03-14 Thread Paul Buiocchi
Greetings,
I am setting up Solr 8 on a vanilla Ubuntu Linux server (16.04).
The whole reason for the setup is to index thousands of PDF files (newspaper
scans).
- I created my core and have Solr up and running.
- I am assuming that I need Apache Tika to index the files.
- Do I tie Tika into Solr via the solrconfig.xml file?
- If so, does anyone have sample syntax?
- Which Tika jar do I need: server or client?
- Are there PDF converters other than Tika? If so, how do they compare?
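
To make the "server or client" question concrete, here is a minimal sketch of
the server-side arrangement: a standalone Tika server extracts the text, and a
small script sends the result to Solr as JSON. It is an illustration under
assumptions, not a recommended setup. The Tika URL assumes tika-server is
running on its default port, and the core name "newspapers" and the field name
"content_txt" are hypothetical (the field relies on a "*_txt" dynamic field
being present in the schema). Scanned pages without a text layer would
additionally need OCR, which this sketch does not cover.

```python
import json
import urllib.request
from pathlib import Path

# Assumption: tika-server is running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"
# Hypothetical core name; replace with the core you created.
SOLR_UPDATE_URL = "http://localhost:8983/solr/newspapers/update?commit=true"


def extract_text(pdf_path):
    """Send the PDF bytes to Tika and return the extracted plain text."""
    req = urllib.request.Request(
        TIKA_URL,
        data=Path(pdf_path).read_bytes(),
        method="PUT",
        headers={"Accept": "text/plain", "Content-Type": "application/pdf"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")


def build_solr_doc(pdf_path, text):
    """Build one Solr document; the file path doubles as the unique id."""
    # "content_txt" is a hypothetical field name (see the note above).
    return {"id": str(pdf_path), "content_txt": text.strip()}


def post_to_solr(docs):
    """POST a list of documents to Solr's JSON update handler."""
    req = urllib.request.Request(
        SOLR_UPDATE_URL,
        data=json.dumps(docs).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    pdf = "page1.pdf"  # hypothetical file
    post_to_solr([build_solr_doc(pdf, extract_text(pdf))])
```

The alternative is to let Solr call Tika itself through its extracting request
handler, which is configured in solrconfig.xml; the sketch above keeps
extraction outside Solr instead.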
Any other advice or suggestions would be welcome.
Thank you all, I really appreciate the help!


Re: Question on Solr/WordPress Integration

2019-03-01 Thread Paul Buiocchi
Thank you, Shawn!

On Fri, Mar 1, 2019 at 12:25 PM, Paul Buiocchi wrote:

Greetings,

I have a couple of questions about Solr/WordPress integration.

First, I am not committed to using WordPress as a front end. If there is a
better front-end option, I would be willing to switch. For functionality,
all I am looking for is full-text search with the search terms highlighted
in the search results. It should be pretty simple; maybe I am overanalyzing
it. I am looking for as much "out of the box" as possible.

My scenario is this:

I am putting together an old newspaper archive site: about 25k PDF files that
are full-text searchable.

Questions on architecture:
1) Is there a way for Solr to index from a local file structure, i.e. local
drive:/newspaper_name/date/page#? From the experimenting I have done with
WordPress/Solr integration, I found that I had to upload the documents to
WordPress to get Solr to recognize them.

I'm sure I will have more questions. Any help or suggestions would be greatly
appreciated. Thank you.


Question on Solr/WordPress Integration

2019-03-01 Thread Paul Buiocchi
Greetings,

I have a couple of questions about Solr/WordPress integration.

First, I am not committed to using WordPress as a front end. If there is a
better front-end option, I would be willing to switch. For functionality,
all I am looking for is full-text search with the search terms highlighted
in the search results. It should be pretty simple; maybe I am overanalyzing
it. I am looking for as much "out of the box" as possible.

My scenario is this:

I am putting together an old newspaper archive site: about 25k PDF files that
are full-text searchable.

Questions on architecture:
1) Is there a way for Solr to index from a local file structure, i.e. local
drive:/newspaper_name/date/page#? From the experimenting I have done with
WordPress/Solr integration, I found that I had to upload the documents to
WordPress to get Solr to recognize them.
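
To illustrate what indexing straight from such a folder layout could look
like, here is a rough sketch that walks a local tree and posts each PDF to
Solr's extracting request handler, bypassing WordPress entirely. It is a
sketch under assumptions, not a tested setup: it assumes the
/update/extract handler is enabled in solrconfig.xml, the core name
"newspapers" and the field names ending in "_s" are hypothetical, and the
tree is assumed to be laid out as newspaper_name/date/page.pdf.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical core name; assumes the extracting handler is enabled.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/newspapers/update/extract"


def extract_params(pdf_path, root):
    """Derive Solr literal fields from a newspaper_name/date/page.pdf path."""
    rel = Path(pdf_path).relative_to(root)
    if len(rel.parts) != 3:
        raise ValueError(f"unexpected layout: {rel}")
    newspaper, date, _ = rel.parts
    return {
        "literal.id": rel.as_posix(),      # relative path as the unique id
        "literal.newspaper_s": newspaper,  # hypothetical field names
        "literal.date_s": date,
        "literal.page_s": rel.stem,
    }


def index_pdf(pdf_path, root):
    """POST one PDF; Solr runs the text extraction on its side."""
    query = urllib.parse.urlencode(extract_params(pdf_path, root))
    req = urllib.request.Request(
        f"{SOLR_EXTRACT_URL}?{query}",
        data=Path(pdf_path).read_bytes(),
        method="POST",
        headers={"Content-Type": "application/pdf"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_tree(root):
    """Index every PDF found under root, in a stable order."""
    for pdf in sorted(Path(root).rglob("*.pdf")):
        index_pdf(pdf, root)


if __name__ == "__main__":
    index_tree("/data/archive")  # hypothetical root directory
```

Documents posted this way only become searchable after a commit, so a real
run would also need to issue one (or rely on autoCommit settings).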

I'm sure I will have more questions. Any help or suggestions would be greatly
appreciated. Thank you.
