This is an interesting discussion.

There is a really nice way to build up an expression from inside a
client. Take a look at the toExpression() method in every TupleStream
implementation and you'll see how to build a StreamExpression. A
StreamExpression is an intermediate format that can become either a live
TupleStream or a String expression.
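
To make the "intermediate format" idea concrete, here is a toy sketch in plain Java. It does not use SolrJ: the Expr class and its withValue/withNamed/withNested methods are hypothetical stand-ins invented for illustration, not the real StreamExpression API. It only shows the String-rendering half of the idea (building a nested expression programmatically, then serializing it); turning the intermediate form into a live TupleStream is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class ExpressionSketch {

    /** Hypothetical stand-in for an intermediate expression: a function name plus parameters. */
    static final class Expr {
        private final String functionName;
        private final List<Object> params = new ArrayList<>();

        Expr(String functionName) {
            this.functionName = functionName;
        }

        /** Adds a bare positional value, e.g. a collection name. */
        Expr withValue(String value) {
            params.add(value);
            return this;
        }

        /** Adds a name="value" parameter. */
        Expr withNamed(String name, String value) {
            params.add(name + "=\"" + value + "\"");
            return this;
        }

        /** Adds another expression as a parameter, so expressions can wrap each other. */
        Expr withNested(Expr inner) {
            params.add(inner);
            return this;
        }

        /** Renders the tree as function(param,param,...), recursing into nested expressions. */
        @Override
        public String toString() {
            StringJoiner joiner = new StringJoiner(",", functionName + "(", ")");
            for (Object p : params) {
                joiner.add(p.toString());
            }
            return joiner.toString();
        }
    }

    /** Builds unique(search(...)) piece by piece, the way a client might. */
    static String buildExample() {
        Expr search = new Expr("search")
                .withValue("collection1")
                .withNamed("q", "*:*")
                .withNamed("fl", "id,a_s")
                .withNamed("sort", "a_s asc");
        Expr unique = new Expr("unique")
                .withNested(search)
                .withNamed("over", "a_s");
        return unique.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildExample());
    }
}
```

The collection and field names (collection1, id, a_s) are made up. In SolrJ itself the equivalent pieces would be the StreamExpression classes and each stream's toExpression() method, whose exact signatures should be checked against the version in use.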

One of the key features coming very soon is a REPL client that will
support interactively building up expressions. This was made possible by
the introduction of variables in Solr 6.6, described here:
http://joelsolr.blogspot.com/2017/05/exploring-solrs-new-time-series-and.html

The REPL client will be a Java command-line client that lives outside Solr
and can connect to any SolrCloud via its ZooKeeper URL.

The REPL client will also ship with SolrJ.

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Apr 26, 2017 at 10:55 AM, Dyer, James <[email protected]>
wrote:

> Thank you for the quick replies.  I can see how it would be powerful to be
> able to execute streaming expressions outside of Solr, giving yourself the
> option of moving some of the work to the client.  I wouldn't necessarily
> tie it into core, because being able to join a Solr stream with an RDBMS
> result -- either within Solr or in your driver program -- could be a
> nice set of options to have.  But the patch on SOLR-1015 seems (from a
> quick look) to get this right, in that it uses the core's classloader
> when it is available and falls back when it is not.  It might be nice --
> especially as the streaming code base grows -- to consider packaging it
> separately from the SolrJ client itself.
>
>
>
> Along these lines:  I was initially confused by the examples in
> https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions in
> that the cURL example at the top is materially different from the SolrJ
> example following it.  That is, with the cURL example, all of the work
> occurs in Solr and only the final result is streamed back.  With the SolrJ
> example, some of that work is now being done in the client.  This is easy
> to discover if you try the JDBC expression:  in the cURL example, the
> query originates in Solr; in the SolrJ example, the query originates
> on the client -- the server has no involvement at all.
>
>
>
> Is my understanding here correct?  I can see how this design has great
> advantage as it gives us the ability to write driver programs that use the
> solr cores as worker nodes.  But this wasn't immediately clear to me.  I
> also wonder:  do we have an (easy) way with SolrJ currently to simply
> execute a (chain of) streaming expression(s) and get the result back, like
> in the cURL example (besides using JDBC)?
>
>
>
> *James Dyer*
>
> Ingram Content Group
>
>
>
> *From:* Joel Bernstein [mailto:[email protected]]
> *Sent:* Tuesday, April 25, 2017 6:25 PM
> *To:* lucene dev <[email protected]>
> *Subject:* Re: JDBCStream and loading drivers
>
>
>
> There are a few stream implementations that have access to SolrCore
> (ClassifyStream, AnalyzeEvaluator) because they use analyzers. These
> classes have been added to core. We could move the JdbcStream to core as
> well if it makes the user experience nicer.
>
>
>
> Originally the idea was that you could run the Streaming API Java classes
> like you would other Solrj clients. I think over time this may become
> important again, as I believe there is work underway for spinning up worker
> nodes that are not attached to a SolrCore.
>
>
> Joel Bernstein
>
> http://joelsolr.blogspot.com/
>
>
>
> On Tue, Apr 25, 2017 at 3:25 PM, Dyer, James <[email protected]>
> wrote:
>
> Using JDBCStream, Solr cannot find my database driver if I put the .jar in
> the shared lib directory ($SOLR_HOME/lib).  In order for the classloader to
> find it, the driver has to be in the server's lib directory.  Looking at
> why, I see that to get the full classpath, including what is in the shared
> lib directory, we'd typically get a reference to a SolrCore, call
> "getResourceLoader" and then "findClass".  This makes use of the
> URLClassLoader that knows about the shared lib.
>
>
>
> But fixing JDBCStream to do this might not be so easy?  Best I can tell,
> Streaming Expressions are written nearly stand-alone as client code that
> merely executes in the Solr JVM.  Is this correct?  Indeed, the code itself
> is included with the client, in the SolrJ package, despite it mostly being
> server-side code … Maybe I misunderstand?
>
>
>
> On the one hand, it isn't a huge deal as to where you need to put your
> drivers to make this work.  But on the other hand, it isn't really the best
> user experience, in my opinion at least, to have to dig around the server
> directories to find where your driver needs to go.  And also, if this is
> truly server-side code, why do we ship it with the client jar?  Unless
> there is a desire to make a stand-alone Streaming Expression engine that
> interacts with Solr as a client, would it be acceptable to somehow expose
> the SolrCore to it for loading resources like this?
>
>
>
> *James Dyer*
>
> Ingram Content Group
>
