[ https://issues.apache.org/jira/browse/SOLR-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Gove updated SOLR-9525:
------------------------------
    Attachment: SOLR-9525.patch

Full implementation and tests for a split operation. Because it is implemented as an operation, it will work as part of a select(...) stream.

Valid expression forms:
{code}
split(fieldA, on=",")              // replaces the value of fieldA with a List<String> of the split values
split(fieldA, on=",", as="fieldB") // splits the value of fieldA into a List<String> and puts it into fieldB
{code}

> split() function for streaming
> ------------------------------
>
>                 Key: SOLR-9525
>                 URL: https://issues.apache.org/jira/browse/SOLR-9525
>             Project: Solr
>          Issue Type: Wish
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Mike Thomsen
>         Attachments: SOLR-9525.patch
>
>
> This is the original description I posted on solr-user:
> Read this article and thought it could be interesting as a way to do
> ingestion:
> https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1
> Example from the article:
> daemon(id="12345",
>   runInterval="60000",
>   update(users,
>     batchSize=10,
>     jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
>       sql="SELECT id, name FROM users", sort="id asc",
>       driver="com.mysql.jdbc.Driver")
>   )
> )
> What's the best way to handle a multivalue field using this API? Is there a
> way to tokenize something returned in a database field?
> Joel Bernstein responded with this:
> Unfortunately there currently isn't a way to split a field. But this would
> be nice functionality to add.
> The approach would be to add a split operation that would be used by the
> select() function. It would look like this:
> select(jdbc(...), split(fieldA, delim=","), ...)
> This would make a good jira issue.
> So the TL;DR version is that I need the ability to specify in such a
> streaming operation certain fields to tokenize into multivalue fields. In one
> schema I may have to support, there are probably a half a dozen such fields.
> Perhaps I am missing a feature here, but it looks like this new capability
> cannot handle multivalue fields until something like this is in place.
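
To illustrate what the two expression forms in the comment do to a single tuple, here is a minimal Python sketch. It is not the code in SOLR-9525.patch and does not call any Solr API. The function name split_field and the parameter as_field are hypothetical, chosen only for illustration. It assumes the delimiter is treated as a literal string, and it assumes a tuple that lacks the field is passed through unchanged; the actual patch may behave differently on both points.

```python
from typing import Any, Dict, Optional


def split_field(
    tuple_fields: Dict[str, Any],
    field: str,
    on: str,
    as_field: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of the tuple with `field` split on the delimiter `on`.

    Mirrors split(fieldA, on=",") when as_field is None, and
    split(fieldA, on=",", as="fieldB") when as_field is given.
    """
    result = dict(tuple_fields)  # copy so the input tuple is not mutated
    value = result.get(field)
    if value is None:
        # Assumption: a missing field leaves the tuple untouched.
        return result
    # Write to the "as" field if one was given, otherwise overwrite in place.
    target = as_field if as_field is not None else field
    # Assumption: `on` is a literal delimiter, not a regular expression.
    result[target] = str(value).split(on)
    return result


row = {"id": "1", "tags": "red,green,blue"}

# Form 1: replace the value of the field with the list of split values.
replaced = split_field(row, "tags", on=",")

# Form 2: keep the original field and put the list into a second field.
copied = split_field(row, "tags", on=",", as_field="tag_list")
```

In the first form the original delimited string is lost, which suits the ingestion case from the issue description where the multivalue field should carry the same name as the database column. The second form keeps both representations in the tuple.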