GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4701#discussion_r25073470
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala ---
    @@ -64,8 +64,8 @@ class JdbcRDD[T: ClassTag](
         // bounds are inclusive, hence the + 1 here and - 1 on end
         val length = 1 + upperBound - lowerBound
         (0 until numPartitions).map(i => {
    -      val start = lowerBound + ((i * length) / numPartitions).toLong
    -      val end = lowerBound + (((i + 1) * length) / numPartitions).toLong - 1
    +      val start = lowerBound + ((BigDecimal(i) * length) / numPartitions).toLong
    --- End diff ---
    
    Ah right, I see: since the bounds are used for ID-like values, it's not unrealistic for `length` to be near 2^63, and it's a `long`. Hm, does this overflow anyway in the final partition? If `upperBound` is near 2^63, this can compute an `end` that doesn't fit back into a `long` even if the intermediate result does. PS: why not `BigInteger`?
    
    While we're fixing it, though, I tend to think @rxin is right that this can just map to fixed intervals rather than perform this computation.
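
    To illustrate, a minimal sketch of that fixed-interval alternative, again as a hypothetical helper: every partition gets the same width and the last one absorbs the remainder up to `upperBound`, so there's no per-partition multiply/divide at all:

    ```scala
    // Hypothetical fixed-interval variant; not from the PR.
    def fixedIntervalBounds(
        lowerBound: Long,
        upperBound: Long,
        numPartitions: Int): Seq[(Long, Long)] = {
      // Width as a BigInt, for the same overflow reason as above.
      val width = (BigInt(1) + upperBound - lowerBound) / numPartitions
      (0 until numPartitions).map { i =>
        val start = lowerBound + width * i
        // The last partition runs to upperBound to cover the remainder.
        val end =
          if (i == numPartitions - 1) BigInt(upperBound)
          else lowerBound + width * (i + 1) - 1
        (start.toLong, end.toLong)
      }
    }
    ```

    The trade-off is that the last partition can be up to `numPartitions - 1` rows larger than the others, whereas the proportional scheme spreads the remainder across partitions.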
    
    BTW, since the range is inclusive, I feel like the last partition should subtract 1 from its end? And this should never be larger than `upperBound`? There might be a few reasons to revise this logic.

