Femi, we have a solution that needs to run both on-prem and in the cloud.
I am not sure how that affects anything. What we want is to run an analytical query on a large dataset (ours is in Cassandra), so batch in that sense, but on demand, and then have the entire result (not just the first x rows) available to a web application.

The web application works over a REST API, so while the query can be submitted through something like Livy or the thrift-server, the concern is how to get the final result back in a usable form. I could think of two ways of doing that. A global temp table would work, but that is essentially the first option, and it seems a bit involved.

My question is: has someone solved this problem and worked through all the steps?

- Affan

On Thu, Oct 25, 2018 at 12:39 PM Femi Anthony <olufemi.anth...@capitalone.com> wrote:

> What sort of environment are you running Spark on: in the cloud or on
> premise? Is it a real-time or batch-oriented application?
> Please provide more details.
> Femi
>
> On Thu, Oct 25, 2018 at 3:29 AM Affan Syed <as...@an10.io> wrote:
>
>> Spark users,
>> We would really like some input on how the results of a Spark query can
>> be made accessible to a web application. Given that Spark is widely used
>> in the industry, I would have thought there would be lots of
>> answers/tutorials about this, but I didn't find anything.
>>
>> Here are a few options that come to mind:
>>
>> 1) Spark results are saved in another DB (perhaps a traditional one) and
>> a request for a query returns the new table name for access through a
>> paginated query. That seems doable, although a bit convoluted, as we need
>> to handle the completion of the query.
>>
>> 2) Spark results are pumped into a messaging queue, to which a
>> socket-server-like connection is made.
>>
>> What confuses me is that other connectors to Spark, like those for
>> Tableau, using something like JDBC, should have all the data (not the top
>> 500 that we typically can get via Livy or other REST interfaces to Spark).
>> How do those connectors get all the data through a single connection?
>>
>> Can someone with expertise help bring some clarity?
>>
>> Thank you.
>>
>> Affan
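
For reference, option 1 from the original message (results saved to another DB, a table name handed back, and paginated reads) could be sketched roughly as below. This is only an illustration under stated assumptions, not something anyone in the thread has confirmed: sqlite3 stands in for the results database, and the `jobs` table, `submit_query`, and `fetch_page` are hypothetical names. The `rows` argument stands in for the Spark output. In a real setup the Spark job would write the table itself, for example with `df.write.jdbc(...)`, and mark the job as done afterwards.

```python
import sqlite3
import uuid

# Hypothetical stand-in for the "other DB" in option 1. A real deployment
# might use PostgreSQL or MySQL, with Spark writing to it over JDBC.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs (job_id TEXT PRIMARY KEY, result_table TEXT, status TEXT)"
)


def submit_query(rows):
    """Register a job, store its full result set, and return the job id.

    `rows` is a list of (id, value) tuples standing in for the Spark output.
    Assumption: in a real setup the Spark driver writes the result table,
    e.g. df.write.jdbc(url, result_table, mode="overwrite", properties=props),
    and only then flips the status to DONE.
    """
    job_id = uuid.uuid4().hex
    result_table = f"result_{job_id}"  # generated here, never user input
    db.execute(
        "INSERT INTO jobs VALUES (?, ?, 'RUNNING')", (job_id, result_table)
    )
    db.execute(f"CREATE TABLE {result_table} (id INTEGER PRIMARY KEY, value TEXT)")
    db.executemany(f"INSERT INTO {result_table} (id, value) VALUES (?, ?)", rows)
    db.execute("UPDATE jobs SET status = 'DONE' WHERE job_id = ?", (job_id,))
    db.commit()
    return job_id


def fetch_page(job_id, after_id=0, page_size=100):
    """Return (status, rows) for one page, using keyset pagination on id."""
    job = db.execute(
        "SELECT result_table, status FROM jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    if job is None:
        return "UNKNOWN", []
    result_table, status = job
    if status != "DONE":
        return status, []  # the web application would poll again later
    rows = db.execute(
        f"SELECT id, value FROM {result_table} WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    return status, rows
```

The REST layer would expose `submit_query` and `fetch_page` as two endpoints. The web application keeps the job id, polls until the status is DONE, and then walks the pages by passing the last `id` it saw as `after_id`, so the full result is reachable without holding one long-lived connection.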