Hey all, I want to load some JSON docs into Solr and manipulate the documents as they go in. I looked at https://lucene.apache.org/solr/guide/8_6/transforming-and-indexing-custom-json.html, but I also wanted to see if streaming expressions would help.
I’ve used the combination of the cat and parseCSV streaming functions successfully to load data into Solr, so I looked a bit at what we could do with a JSON source format. I didn’t see an obvious path for taking a .json file and loading it, so I played around and wrote a streaming expression for files in JSON Lines format: https://github.com/epugh/playing-with-solr-streaming-expressions/pull/3

The expression looks like:

    commit(icecat, update(icecat, parseJSONL(cat('two_docs.jsonl'))))

I’m curious what other folks have done. I saw that there is a JSONTupleStream, but it didn’t quite seem to fit the need.

Eric

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | http://www.opensourceconnections.com
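
P.S. For anyone who has not worked with the format: JSON Lines is simply one complete JSON object per line, which is what lets a line-oriented source like cat hand documents over one at a time. Below is a minimal Python sketch of writing and reading that format. It is not taken from the PR and does not touch Solr; the helper names and the document fields are made up for illustration.

```python
import json


def to_json_lines(docs):
    """Serialize an iterable of dicts as JSON Lines: one compact JSON object per line."""
    return "".join(json.dumps(doc, separators=(",", ":")) + "\n" for doc in docs)


def parse_json_lines(text):
    """Parse JSON Lines text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical documents; the field names are assumptions, not the icecat schema.
docs = [
    {"id": "1", "name": "First product"},
    {"id": "2", "name": "Second product"},
]

jsonl_text = to_json_lines(docs)               # what a file like two_docs.jsonl would contain
round_tripped = parse_json_lines(jsonl_text)   # each line parses on its own
```

Because every line is a standalone document, any per-document manipulation can be applied line by line without holding the whole file in memory.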