[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-16 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 @squito Your guess was right, and I can remove these blocks by https://issues.apache.org/jira/browse/SPARK-20923. I will test the performance after this patch and refine or close the

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-15 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 Thanks for the constructive feedback. Here is my benchmark for a step of 1MB. During this run the speedup was 23%; I think there was some interference on my workstation. ``` 2048

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-14 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/19978 agree with @srowen -- how close are we to your optimal size if we just hardcode it to 1MB? I don't think we want the default to be 30MB. btw, what is in your event logs that one line is 1.5

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19978 Can one of the admins verify this patch?

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-14 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19978 That seems quite large, but I'm only speculating. If it defaulted to something that's just not tiny, would that help? 1M events?
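
The question in this thread, whether a moderately larger read buffer already captures most of the gain, can be checked with a small timing harness. The following is only an illustrative sketch in Python, not the Spark code under review (which runs on the JVM). The function names `count_lines`, `time_buffer_sizes`, and `make_sample_log` are hypothetical, the log file is synthetic, and no timings are claimed; real event logs with much longer lines may behave differently.

```python
import os
import tempfile
import time


def count_lines(path, buffer_size):
    """Count lines in a file, reading through a buffer of `buffer_size` bytes."""
    count = 0
    # buffering=N (N > 1) asks Python for a buffered reader with an N-byte buffer.
    with open(path, "rb", buffering=buffer_size) as f:
        for _ in f:
            count += 1
    return count


def time_buffer_sizes(path, sizes):
    """Return {buffer_size: (line_count, seconds)} for each candidate size."""
    results = {}
    for size in sizes:
        start = time.perf_counter()
        lines = count_lines(path, size)
        results[size] = (lines, time.perf_counter() - start)
    return results


def make_sample_log(num_lines=1000, line_len=200):
    """Write a synthetic log file (hypothetical data) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(fd, "wb") as f:
        for i in range(num_lines):
            f.write((str(i).ljust(line_len, "x") + "\n").encode("ascii"))
    return path


if __name__ == "__main__":
    sample = make_sample_log()
    try:
        # Compare the small default discussed in the thread (2048) with 1 MB.
        for size, (lines, secs) in time_buffer_sizes(sample, [2048, 1 << 20]).items():
            print(f"buffer={size:>8} lines={lines} time={secs:.4f}s")
    finally:
        os.remove(sample)
```

Running the comparison several times and on an otherwise idle machine helps separate a real effect from workstation noise.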

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-14 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 I don't mind just setting it to a higher value. Moreover, the current default value (2048) is small in any case. For my log files, a 30M buffer was the best value (a bigger one did not
