[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-10-03 Thread QuentinAmbard
Github user QuentinAmbard commented on the issue: https://github.com/apache/spark/pull/21917 SPARK-25005 actually has a far better solution for detecting lost messages. Will try to apply the same logic...

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-06 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/21917 Recursively creating a Kafka RDD during creation of a Kafka RDD would need a base case, but yeah, some way to have appropriate preferred locations.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-06 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/21917 Example report of skipped offsets in a non-compacted non-transactional situation

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-06 Thread QuentinAmbard
Github user QuentinAmbard commented on the issue: https://github.com/apache/spark/pull/21917 > By failed, you mean returned an empty collection after timing out, even though records should be available? You don't. You also don't know that it isn't just lost because Kafka skipped a

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-04 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/21917 > How do you know that offset 4 isn't just lost because poll failed? By failed, you mean returned an empty collection after timing out, even though records should be available? You

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-04 Thread QuentinAmbard
Github user QuentinAmbard commented on the issue: https://github.com/apache/spark/pull/21917 If you are doing it in advance you'll change the range, so for example you read until 3 and don't get any extra results. Maybe it's because of a transaction offset, maybe another issue, it's

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-04 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/21917 If the last offset in the range as calculated by the driver is 5, and on the executor all you can poll up to after a repeated attempt is 3, and the user already told you to

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-04 Thread QuentinAmbard
Github user QuentinAmbard commented on the issue: https://github.com/apache/spark/pull/21917 I'm not sure I understand your point. The cause of the gap doesn't matter; we just want to stop on an existing offset to be able to poll it. It can be because of a transaction marker, a
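
The idea of stopping on an existing offset can be illustrated with a minimal sketch. This is not the code from the pull request: the function name `align_until`, the sorted list of existing offsets, and the exclusive-end convention are all assumptions made for illustration. It also does not identify why a gap exists, which is the objection raised elsewhere in the thread.

```python
from bisect import bisect_left

def align_until(existing_offsets, from_offset, until_offset):
    """Hypothetical helper: shrink [from_offset, until_offset) so that it
    ends just past the last offset that actually holds a record.

    existing_offsets must be sorted ascending. If no record exists in the
    range, from_offset is returned (an empty range).
    """
    lo = bisect_left(existing_offsets, from_offset)
    hi = bisect_left(existing_offsets, until_offset)
    if hi == lo:
        # Nothing pollable in the requested range.
        return from_offset
    # Exclusive end, one past the last existing offset in the range.
    return existing_offsets[hi - 1] + 1

# Assumed scenario: the driver computed a range whose last offset is 5,
# but only offsets 0..3 hold records (4 and 5 are gaps of unknown cause).
aligned = align_until([0, 1, 2, 3], 0, 6)
```

Here `aligned` is 4, so the last offset read would be 3. The sketch assumes the set of existing offsets is already known, which in practice is the hard part the thread is debating.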

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-04 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/21917 Still playing devil's advocate here, I don't think stopping at 3 in your example actually tells you anything about the cause of the gaps in the sequence at 4. I'm not sure you can know that the

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-03 Thread QuentinAmbard
Github user QuentinAmbard commented on the issue: https://github.com/apache/spark/pull/21917 With this solution we don't read the data another time "just to support transaction." The current implementation of compacted topics already reads all the messages twice in order to get a

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94058/ Test FAILed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #94058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94058/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94056/ Test FAILed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #94056 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94056/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Merged build finished. Test FAILed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #94058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94058/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Merged build finished. Test FAILed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94055/ Test FAILed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #94055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94055/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #94056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94056/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #94055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94055/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93803/ Test PASSed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #93803 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93803/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Merged build finished. Test PASSed.

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21917 **[Test build #93803 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93803/testReport)** for PR 21917 at commit

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/21917 jenkins, ok to test

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Can one of the admins verify this patch?

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Can one of the admins verify this patch?

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21917 Can one of the admins verify this patch?

[GitHub] spark issue #21917: [SPARK-24720][STREAMING-KAFKA] add option to align range...

2018-07-30 Thread holdensmagicalunicorn
Github user holdensmagicalunicorn commented on the issue: https://github.com/apache/spark/pull/21917 @QuentinAmbard, thanks! I am a bot who has found some folks who might be able to help with the review: @tdas, @zsxwing and @koeninger