s what you were referring to originally?
Thanks
-Nitin
On Fri, Nov 25, 2016 at 11:29 AM, Reynold Xin <r...@databricks.com> wrote:
> It's already there isn't it? The in-memory columnar cache format.
>
>
> On Thu, Nov 24, 2016 at 9:06 PM, Nitin Goyal <nitin2go...@gmail.com>
Hi,
Do we have any plan to support Parquet-like partitioning in the
Spark SQL in-memory cache? Something like one RDD[CachedBatch] per
in-memory cache partition.
-Nitin
ew API? Is this the expected
behaviour or am I missing something here?
--
Regards
Nitin Goyal
Hi Daniel,
I did track down the problem in my case, and it turned out to be a
bug on the Parquet side; I raised and contributed to the following issue
:-
https://issues.apache.org/jira/browse/PARQUET-353
Hope this helps!
Thanks
-Nitin
On Mon, May 2, 2016 at 9:15 PM, Daniel Darabos
Spark SQL's in-memory cache stores statistics per column, which are in turn
used to skip batches (default batch size 10,000 rows) within a partition:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnStats.scala#L25
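To illustrate the idea (this is a minimal Python sketch of min/max batch skipping, not Spark's actual implementation — the function names and batch layout here are made up for illustration):

```python
# Sketch of how per-batch min/max statistics let a scan skip whole
# batches, the way Spark SQL's in-memory cache uses ColumnStats.
def build_batches(values, batch_size):
    """Split values into batches, recording min/max per batch."""
    batches = []
    for i in range(0, len(values), batch_size):
        chunk = values[i:i + batch_size]
        batches.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return batches

def scan_greater_than(batches, threshold):
    """Return rows > threshold, skipping batches whose max <= threshold."""
    out = []
    for b in batches:
        if b["max"] <= threshold:
            continue  # whole batch skipped without touching its rows
        out.extend(v for v in b["rows"] if v > threshold)
    return out

batches = build_batches(list(range(100)), batch_size=10)
print(scan_greater_than(batches, 95))  # → [96, 97, 98, 99]
```

Only the last batch (min 90, max 99) overlaps the predicate, so the other nine batches are skipped entirely.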
Hope this helps
Thanks
-Nitin
ory)" written which
means input data has been fetched from memory (your cached RDD).
As far as lineage/call site is concerned, I think there was a change in
Spark 1.3 that excluded some classes from appearing in the call site (I know
that some Spark SQL related classes were removed for sure).
Thanks
-Nit
I think Spark SQL's in-memory columnar cache already does compression. Check
out the classes under the following path :-
https://github.com/apache/spark/tree/master/sql/core/src/main/scala/org/apache/spark/sql/columnar/compression
Although the compression ratio is not as good as Parquet's.
Thanks
-Nitin
I am running a Spark application on YARN with 2 executors, each with Xms/Xmx
of 32 GB and spark.yarn.executor.memoryOverhead of 6 GB.
I am seeing that the app's physical memory keeps increasing until the
container finally gets killed by the NodeManager:
2015-07-25 15:07:05,354 WARN
for a
single query. I also looked at the fix's code diff, and it wasn't related to
the problem, which seems to be in the ClosureCleaner code.
Thanks
-Nitin
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/ClosureCleaner-slowing-down-Spark-SQL-queries
Hi Spark Dev Team,
I want to start contributing to Spark open source. This is the first time I
will be doing any open source contribution.
It would be great if I could get some guidance on where to start.
Thanks,
- Nitin
?
Thanks
-Nitin
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Does-Spark-delete-shuffle-files-of-lost-executor-in-running-system-on-YARN-tp10755.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com
and running out of space, as
it's a long-running Spark job (running Spark in yarn-client mode, btw).
Thanks
-Nitin
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-SQL-Long-running-job-tp10717.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com