Re: Indexing excel (xlsx) file into SOLR 8.1.1

2019-07-26 Thread Charlie Hull
Simpler possibly, but not necessarily reliable. If you do everything inside Solr's DIH with Tika under the hood to extract data from Excel, a malformed Excel file could kill Tika and bring down your entire Solr cluster. Far better to do it outside of Solr as this blog describes:

Re: Indexing excel (xlsx) file into SOLR 8.1.1

2019-07-26 Thread Vipul Bahuguna
Hi Charlie, Thanks for your suggestion, but I will have thousands of these files coming from different sources. It would become very tedious if I had to first convert them to CSV and then process them line by line. I was hoping there could be a simpler way to achieve this using DIH which I thought

Re: Indexing excel (xlsx) file into SOLR 8.1.1

2019-07-26 Thread Charlie Hull
Convert the Excel file to a CSV and then write a teeny script to go through it line by line and submit to Solr over HTTP? Tika would probably work but it's a lot of heavy lifting for what seems to me like a simple problem. Cheers Charlie On 26/07/2019 09:19, Vipul Bahuguna wrote: Hi Guys -
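For illustration, the "teeny script" idea could look roughly like the sketch below. It is not Charlie's code: it assumes the CSV header row matches the field names in the Solr schema, that Solr is reachable on localhost, and that the core is called mycore (a hypothetical name). It posts batches of rows as a JSON array to Solr's /update handler using only the Python standard library.

```python
import csv
import io
import json
import urllib.request

# Assumption: host and core name ("mycore") are placeholders for your own setup.
# commit=true makes each batch visible immediately; simple, but not tuned for speed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?commit=true"


def read_batches(csv_file, batch_size=500):
    """Yield lists of row dicts, using the CSV header row as field names."""
    batch = []
    for row in csv.DictReader(csv_file):
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build a POST request carrying a JSON array of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_csv(path):
    """Send every batch in the CSV file to Solr; raises on HTTP errors."""
    with open(path, newline="", encoding="utf-8") as f:
        for docs in read_batches(f):
            with urllib.request.urlopen(build_update_request(docs)) as resp:
                resp.read()  # drain the response; status errors raise HTTPError
```

Calling index_csv("export.csv") (a hypothetical file name) would submit the whole file. Because the script runs outside Solr, a bad input file only fails the script, not the Solr node.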

Indexing excel (xlsx) file into SOLR 8.1.1

2019-07-26 Thread Vipul Bahuguna
Hi Guys - can anyone suggest how to achieve this? I have understood how to insert JSON documents. So one alternative that comes to my mind is that I can convert the rows in my Excel file to JSON format, with the header of my Excel file becoming the JSON keys (corresponding to the fields I have defined
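A minimal sketch of the header-to-keys idea described in this message, under some assumptions: the rows are already available as a sequence of tuples with the header first (for example from a spreadsheet library; openpyxl's iter_rows(values_only=True) yields rows in this shape), and empty cells should simply be left out of the document. The function name rows_to_docs is made up for illustration.

```python
import json


def rows_to_docs(rows):
    """Turn spreadsheet rows into dicts keyed by the header row.

    `rows` is any iterable of sequences; the first one is treated as the header.
    Empty cells (None or "") are omitted, and fully empty rows are skipped.
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return []  # no header row, nothing to convert
    keys = [str(h).strip() for h in header]
    docs = []
    for row in rows:
        doc = {k: v for k, v in zip(keys, row) if v is not None and v != ""}
        if doc:
            docs.append(doc)
    return docs


if __name__ == "__main__":
    # Made-up sample data standing in for one worksheet.
    sheet = [
        ("id", "title", "price"),
        (1, "Widget", 9.5),
        (2, "Gadget", None),
    ]
    print(json.dumps(rows_to_docs(sheet), indent=2))
```

The resulting list can be serialized with json.dumps and inserted the same way as any other JSON documents.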