Simpler possibly, but not necessarily reliable. If you do everything
inside Solr's DIH with Tika under the hood to extract data from Excel, a
malformed Excel file could kill Tika and bring down your entire Solr
cluster. Far better to do it outside of Solr as this blog describes:
Hi Charlie,
Thanks for your suggestion, but I will have thousands of these files
coming from different sources. It would become very tedious if I have to
first convert them to CSV and then process them line by line.
I was hoping there might be a simpler way to achieve this using DIH,
which is what I had in mind.
Convert the Excel file to a CSV and then write a teeny script to go
through it line by line and submit to Solr over HTTP? Tika would
probably work but it's a lot of heavy lifting for what seems to me like
a simple problem.
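A minimal sketch of that suggestion in Python, using only the standard library: parse the CSV with `csv.DictReader` (so the header row becomes the field names) and POST the resulting documents to Solr's JSON update endpoint. The Solr URL and core name (`mycore`) are placeholders, not from this thread.

```python
import csv
import io
import json
import urllib.request

def csv_rows_to_docs(csv_text):
    """Parse CSV text; each row becomes a dict keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def post_to_solr(docs, solr_url="http://localhost:8983/solr/mycore/update?commit=true"):
    """POST a list of documents to Solr's JSON update handler.

    The URL (host, port, core name) is a placeholder; adjust for your setup.
    """
    data = json.dumps(docs).encode("utf-8")
    req = urllib.request.Request(
        solr_url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    sample = "id,title\n1,First doc\n2,Second doc\n"
    docs = csv_rows_to_docs(sample)
    print(docs)  # [{'id': '1', 'title': 'First doc'}, {'id': '2', 'title': 'Second doc'}]
    # post_to_solr(docs)  # requires a running Solr instance
```

Note that `DictReader` yields everything as strings; if the Solr schema expects typed fields (ints, dates), the script would need to convert them first.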
Cheers
Charlie
On 26/07/2019 09:19, Vipul Bahuguna wrote:
Hi Guys - can anyone suggest how to achieve this?
I have understood how to insert JSON documents. So one alternative that
comes to my mind is that I can convert the rows in my Excel file to JSON format,
with the header of my Excel file becoming the JSON keys (corresponding to
the fields I have defined
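The idea above (header row becoming the JSON keys) can be sketched as a small Python helper. Reading the actual .xlsx file would need a third-party package such as openpyxl, so that part is shown only as a comment; the field names and filename here are made up for illustration.

```python
def rows_to_json_docs(rows):
    """First row is the header; each remaining row becomes a dict keyed by it."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]

# With openpyxl (third-party, not part of this thread) it would look roughly like:
#   from openpyxl import load_workbook
#   ws = load_workbook("data.xlsx").active
#   docs = rows_to_json_docs(list(ws.iter_rows(values_only=True)))

# Illustrative rows standing in for an Excel sheet:
rows = [("id", "title"), (1, "First"), (2, "Second")]
print(rows_to_json_docs(rows))  # [{'id': 1, 'title': 'First'}, {'id': 2, 'title': 'Second'}]
```

The resulting list of dicts can then be serialized with `json.dumps` and sent to Solr's update handler, just as with hand-written JSON documents.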