Hey all,

I wanted to load some JSON docs into Solr and manipulate the documents as
they go in. I looked at
https://lucene.apache.org/solr/guide/8_6/transforming-and-indexing-custom-json.html,
however I also wanted to see if Streaming would help.

I’ve used the combination of the cat and parseCSV streaming functions
successfully to load data into Solr, so I looked a bit at what we could do
with JSON as the source format.

I didn’t see an obvious path for taking a .json file and loading it, so I
played around and made this streaming expression for JSON Lines formatted
files:
https://github.com/epugh/playing-with-solr-streaming-expressions/pull/3

The expression looks like:

commit(icecat,
  update(icecat,
    parseJSONL(
      cat('two_docs.jsonl')
    )
  )
)

I was curious what other folks have done. I saw that there is a
JSONTupleStream, but it didn’t quite seem to fit the need.
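
For anyone not familiar with the JSON Lines format, here is a minimal Python
sketch of what a JSON Lines parsing step does conceptually: each non-blank
line is parsed on its own as one JSON object, which would then become one
document. This is only an illustration, not the code in the PR; the name
parse_jsonl and the sample records are made up, and the real contents of
two_docs.jsonl are not shown here.

```python
import json
from typing import Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank line of JSON Lines input."""
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # tolerate blank lines between records
        record = json.loads(text)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        yield record


# Hypothetical sample standing in for the lines of a .jsonl file.
sample = [
    '{"id": "1", "title": "first doc"}\n',
    '\n',
    '{"id": "2", "title": "second doc"}\n',
]
docs = list(parse_jsonl(sample))
```

Because every line is a complete JSON value, the file can be processed one
record at a time without reading the whole thing into memory, which is why
the format pairs naturally with a line-oriented source like cat.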

Eric

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
