Hi Xavier,

Dremio looks really interesting and has a nice UI. I think the idea of
replacing SSIS or similar tools with Dremio is not bad, but what about
complex scenarios with a lot of code and transformations?
Is it possible to use Dremio via an API and define your own transformations
and transformation workflows in Java or Scala?
I am not sure whether that is supported at all.
I think the Dremio team is planning to give users access to the Sabot API
so that Dremio can be used the same way you can use Spark, but I am not
sure whether that is possible today.
Have you also tried comparing performance with Spark? Are there any
benchmarks?

Best,
Michael

On Mon, May 14, 2018 at 6:53 AM, xmehaut <xavier.meh...@gmail.com> wrote:

> Hello,
> I have some questions about Spark and Apache Arrow. Up to now, Arrow is
> only used for sharing data between Python and Spark executors instead of
> transmitting it through sockets. I am currently studying Dremio as an
> interesting way to access multiple sources of data, and as a potential
> replacement for ETL tools, including Spark SQL.
> It seems, if the promises hold, that Arrow and Dremio may be game-changing
> for these two purposes (data source abstraction, ETL tasks), leaving Spark
> with the two remaining goals, i.e. ML/DL and graph processing, which could
> be a danger for Spark in the middle term with the rise of multiple
> frameworks in these areas.
> My questions are then:
> - Is there a means to use Arrow more broadly in Spark itself, and not only
> for sharing data?
> - What are the strengths and weaknesses of Spark with respect to Arrow,
> and consequently Dremio?
> - What is the difference, finally, between Databricks DBIO and
> Dremio/Arrow?
> - How do you see the future of Spark given these assumptions?
> regards
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
>
