Github user MikhailErofeev commented on the issue:
https://github.com/apache/spark/pull/19978
@squito
Your guess was right, and I can remove these blocks by
https://issues.apache.org/jira/browse/SPARK-20923. I will test the performance
after this patch and refine or close the
Github user MikhailErofeev commented on the issue:
https://github.com/apache/spark/pull/19978
Thanks for the constructive feedback.
Here is my benchmark for a step of 1MB. During this run the speedup was
23%; I think there was some interference on my workstation.
```
2048
```
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19978
agree with @srowen -- how close are we to your optimal size if we just
hardcode it to 1MB? I don't think we want the default to be 30MB.
btw, what is in your event logs that one line is 1.5
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19978
Can one of the admins verify this patch?
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/19978
That seems quite large, but I'm only speculating. If it defaulted to
something that's just not tiny, would that help? 1M events?
---
Github user MikhailErofeev commented on the issue:
https://github.com/apache/spark/pull/19978
I don't mind just setting it to a higher value. Moreover, the current
default value (2048) is small in any case.
For my log files, 30M buffer was the best value (a bigger one did not
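The trade-off being discussed can be illustrated with a rough, standalone micro-benchmark: time a line-by-line read of a synthetic "event log" at the old 2048-byte default buffer versus larger buffers. This is a sketch only; the file contents, sizes, and read loop are illustrative and not the code from the patch.

```python
import os
import tempfile
import time

def read_lines(path, buf_size):
    """Read a file line by line with an explicit buffer size; return the line count."""
    count = 0
    with open(path, "rb", buffering=buf_size) as f:
        for _ in f:
            count += 1
    return count

# Build a throwaway file of 10k synthetic JSON "event" lines (~50 bytes each).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    for i in range(10_000):
        f.write(b'{"Event":"SparkListenerTaskEnd","Task ID":%d}\n' % i)

# Compare the old 2 KB default against 1 MB and 30 MB buffers.
for buf in (2048, 1 << 20, 30 << 20):
    t0 = time.perf_counter()
    n = read_lines(path, buf)
    print(f"buffer={buf:>9}: {n} lines in {time.perf_counter() - t0:.4f}s")

os.remove(path)
```

On a small local file the difference is mostly noise; the gap the thread describes shows up on multi-gigabyte logs, where a 1 MB buffer already captures most of the win over 2048 bytes and diminishing returns set in well before 30 MB.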