OData compliant API for Spark

2018-12-04 Thread Affan Syed
All, we have been thinking about exposing our platform for analytics as an OData server (for its ease of compliance with 3rd-party BI tools like Tableau, etc.) -- so Livy is not in the picture right now. Has there been any effort in this regard? Is there any interest, or has there been any
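An OData facade for Spark would need, among other things, to translate OData `$filter` expressions into Spark SQL predicates. A minimal sketch of that mapping, assuming a hypothetical translator (not any existing library) that only handles binary comparisons joined by `and`/`or`:

```python
# Hypothetical sketch: map a tiny subset of OData $filter syntax to a
# Spark-SQL-style WHERE clause. Real OData defines a full expression
# grammar; this token-by-token substitution is illustrative only.
ODATA_OPS = {"eq": "=", "ne": "<>", "gt": ">", "ge": ">=", "lt": "<", "le": "<="}

def odata_filter_to_sql(expr: str) -> str:
    out = []
    for tok in expr.split():
        if tok in ODATA_OPS:
            out.append(ODATA_OPS[tok])       # comparison operator
        elif tok in ("and", "or"):
            out.append(tok.upper())          # boolean connective
        else:
            out.append(tok)                  # column name or literal
    return " ".join(out)
```

For example, `price gt 100 and region eq 'EU'` would become `price > 100 AND region = 'EU'`, which could be fed to `DataFrame.where`.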

[ANNOUNCE] Apache Bahir 2.3.1 Released

2018-12-04 Thread Luciano Resende
Apache Bahir provides extensions to multiple distributed analytic platforms, extending their reach with a diversity of streaming connectors and SQL data sources. The Apache Bahir community is pleased to announce the release of Apache Bahir 2.3.1 which provides the following extensions for Apache

[ANNOUNCE] Apache Bahir 2.3.2 Released

2018-12-04 Thread Luciano Resende
Apache Bahir provides extensions to multiple distributed analytic platforms, extending their reach with a diversity of streaming connectors and SQL data sources. The Apache Bahir community is pleased to announce the release of Apache Bahir 2.3.2 which provides the following extensions for Apache

[ANNOUNCE] Apache Bahir 2.3.0 Released

2018-12-04 Thread Luciano Resende
Apache Bahir provides extensions to multiple distributed analytic platforms, extending their reach with a diversity of streaming connectors and SQL data sources. The Apache Bahir community is pleased to announce the release of Apache Bahir 2.3.0 which provides the following extensions for Apache

[Spark Structured Streaming] Dynamically changing maxOffsetsPerTrigger

2018-12-04 Thread subramgr
Is there a way to dynamically change the value of *maxOffsetsPerTrigger*?
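As a sketch, not a definitive answer: the Kafka source's `maxOffsetsPerTrigger` option is read when the query starts, so the usual workaround is to stop the running query and restart it with a new value. The helper below just builds the option map you would pass to the Kafka reader; it is pure Python so it runs without a Spark installation, and the restart pattern is shown in the trailing comment.

```python
# Sketch: "changing" maxOffsetsPerTrigger means restarting the query
# with a new option map. This helper only assembles that map; the
# function name is illustrative, not a Spark API.
def kafka_source_options(bootstrap_servers: str, topic: str,
                         max_offsets_per_trigger: int) -> dict:
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Upper bound on offsets consumed per micro-batch (Kafka source
        # expects option values as strings):
        "maxOffsetsPerTrigger": str(max_offsets_per_trigger),
    }

# With pyspark available, the restart step would look roughly like:
#   query.stop()
#   df = (spark.readStream.format("kafka")
#         .options(**kafka_source_options("host:9092", "events", 5000))
#         .load())
#   query = df.writeStream.start()
```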

Recommended Node Usage

2018-12-04 Thread Hans Fischer
Dear Spark community, is it recommended (and why) to use a hardware node at 100% (put one or more vcores onto every hardware core), instead of running Spark at 95% system load to gain better system stability? Thank you very much for your contribution, Hans
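One common rule of thumb (a sketch, not an official Spark recommendation) is to reserve a core or two per node for the OS and cluster daemons rather than handing Spark 100% of the hardware, which is roughly the "95% load" idea in the question:

```python
# Illustrative helper (not a Spark API): cores to give to executors on
# one node, keeping a small reservation so the OS, shuffle service, and
# other daemons are not starved.
def executor_cores_per_node(total_cores: int, reserved_for_os: int = 1) -> int:
    if reserved_for_os >= total_cores:
        raise ValueError("reservation leaves no cores for Spark")
    return total_cores - reserved_for_os
```

On a 16-core node this would give executors 15 cores with the default reservation of one.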

Re: Job hangs in blocked task in final parquet write stage

2018-12-04 Thread Conrad Lee
Yeah, increasing the memory or the number of output partitions would probably help. However, increasing the memory available to each executor would add expense. I want to keep the number of partitions low so that each parquet file turns out to be around 128 MB, which is best practice for
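The partition count implied by that target can be computed directly. A sketch, assuming a hypothetical helper (actual output file size also depends on parquet compression and encoding, so this is only a starting estimate):

```python
import math

# Illustrative only: rough partition count so each parquet output file
# lands near the target size (128 MiB here, per the thread).
def num_output_partitions(total_bytes: int,
                          target_file_bytes: int = 128 * 1024 ** 2) -> int:
    return max(1, math.ceil(total_bytes / target_file_bytes))
```

For a 10 GiB dataset this suggests 80 partitions, which could be passed to `DataFrame.repartition` before the write.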
