GitHub user yaooqinn commented on the pull request:

    https://github.com/apache/spark/pull/9661#issuecomment-156359312
  
    Following up on the questions in my last comment:
    ### continuous
    ```scala
    scala> import org.roaringbitmap._
    import org.roaringbitmap._
    
    scala> val r = new RoaringBitmap()
    r: org.roaringbitmap.RoaringBitmap = {}
    
    scala> for (i <- 1 to 2000000)  r.add(i)
    
    scala> r.runOptimize
    res1: Boolean = true
    
    scala> r.contains(2000000)
    res2: Boolean = true
    ```
    bits | original size | optimized size
    ---|---|----
    200K|33,056|328
    2M|255,472|1,768
    
    
    
    ### uncontinuous
    ```scala
    scala> import org.roaringbitmap._
    import org.roaringbitmap._
    
    scala> val r = new RoaringBitmap()
    r: org.roaringbitmap.RoaringBitmap = {}
    
    scala> for (i <- 1 to 2000000) if(i%10==0) r.add(i)
    
    scala> r.runOptimize
    res1: Boolean = false
    
    scala> r.trim
    
    scala> r.contains(2000000)
    res2: Boolean = true
    ```
    bits | original size | optimized size | trimmed size
    ---|---|----|----
    200K|255,472|255,472|254,072
    2M|2,516,688|2,516,688|2,516,272
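    
    The sessions above do not show how the sizes were taken. One way to collect comparable figures is sketched below, assuming RoaringBitmap's ```getSizeInBytes``` is an acceptable measure. The helper ```sizes``` is hypothetical and not part of the original experiment. ```getSizeInBytes``` is an estimate derived from the containers, so the numbers need not match the tables, and it may not reflect what ```trim``` frees.
    
    ```scala
    import org.roaringbitmap.RoaringBitmap
    
    // Hypothetical helper (not from the original comment): builds a bitmap from
    // `values` and reports getSizeInBytes before runOptimize, after runOptimize,
    // and after trim.
    def sizes(values: Iterator[Int]): (Int, Int, Int) = {
      val r = new RoaringBitmap()
      values.foreach(v => r.add(v))
      val original = r.getSizeInBytes
      r.runOptimize()                 // returns false when no container was converted
      val optimized = r.getSizeInBytes
      r.trim()                        // releases unused capacity in the containers
      val trimmed = r.getSizeInBytes
      (original, optimized, trimmed)
    }
    
    // Same two data sets as in the sessions above.
    val contiguous = sizes((1 to 2000000).iterator)
    val everyTenth = sizes((1 to 2000000).iterator.filter(_ % 10 == 0))
    println(s"contiguous:  $contiguous")
    println(s"every tenth: $everyTenth")
    ```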
    
    ### conclusion
    In the ```uncontinuous``` case, ```runOptimize``` failed (it returned ```false```), and even after ```r.trim``` the size barely shrank, unlike in the ```continuous``` case.
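    
    A rough way to see why ```runOptimize``` cannot help here is to compare per-container costs. The sketch below assumes the container layout of the published Roaring format (arrays of 16-bit values up to 4096 entries, 8192-byte bitmaps, runs stored as 16-bit start/length pairs) and ignores JVM object overhead. The helper names are hypothetical, and only the first 65536-value chunk is examined.
    
    ```scala
    // Approximate payload cost of each container type, in bytes.
    def arrayBytes(cardinality: Int): Int = 2 * cardinality  // one 16-bit value per entry
    val bitmapBytes: Int = 8192                               // 65536 bits, fixed size
    def runBytes(runs: Int): Int = 2 + 4 * runs               // run count + (start, length) pairs
    
    // Hypothetical helper: counts the set values and the maximal runs of
    // consecutive values inside one 65536-value chunk.
    def chunkStats(chunk: Int, member: Int => Boolean): (Int, Int) = {
      val base = chunk << 16
      var cardinality = 0
      var runs = 0
      var prev = false
      for (low <- 0 until 65536) {
        val cur = member(base + low)
        if (cur) { cardinality += 1; if (!prev) runs += 1 }
        prev = cur
      }
      (cardinality, runs)
    }
    
    // True when a run container would be smaller than the cheaper of array and bitmap.
    def runWins(cardinality: Int, runs: Int): Boolean = {
      val best = if (cardinality <= 4096) arrayBytes(cardinality) else bitmapBytes
      runBytes(runs) < best
    }
    
    val inRange = (v: Int) => v >= 1 && v <= 2000000
    val (c1, r1) = chunkStats(0, inRange)                         // contiguous values
    val (c2, r2) = chunkStats(0, v => inRange(v) && v % 10 == 0)  // every tenth value
    println(s"contiguous:  cardinality=$c1 runs=$r1 runWins=${runWins(c1, r1)}")
    println(s"every tenth: cardinality=$c2 runs=$r2 runWins=${runWins(c2, r2)}")
    ```
    
    Under these assumptions the contiguous chunk collapses to a single run, while every-tenth data has as many runs as values, so a run container would cost more than the fixed-size bitmap.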
    
    I guess my ```separating sparse case``` idea is still needed, so that fewer bits are added to the RoaringBitmap in case ```runOptimize``` fails.
     

