[GitHub] spark pull request: [SPARK-911] allow efficient queries for a rang...

JoshRosen Mon, 15 Sep 2014 10:34:08 -0700

Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/1381#issuecomment-55626881
  
    @aaronjosephs The binary search is a good idea, although I think there are 
a few subtleties involved in getting it to work generally.  Imagine that I call 
sortByKey() on an RDD and then perform a transformation that preserves 
sortedness (e.g. mapValues() or a regular filter()).  In these cases, it would 
be nice to recognize that the RDD is still sorted.  For partitioners, we have 
flags like `preservesPartitioning` for tracking which operations preserve the 
space of keys in a partition, so it might be nice to add something similar for 
other properties, such as sortedness, distinctness, etc.
    
    Personally, I think that is a larger design challenge that might be worth 
deferring to a separate PR.
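
    To make the idea concrete, here is a rough sketch in plain Python (not 
Spark code). `SortedAwareDataset`, its `is_sorted` flag, and 
`_first_index_where` are hypothetical names invented for illustration. The 
flag plays a role loosely analogous to `preservesPartitioning`: 
transformations that keep keys and order intact propagate it, anything that 
may rewrite keys drops it, and the range query only uses binary search when 
the flag is set.

```python
def _first_index_where(part, pred):
    """Binary search for the first index whose key satisfies pred.

    pred must be monotone over the sorted keys (False ..., then True ...).
    """
    lo, hi = 0, len(part)
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(part[mid][0]):
            hi = mid
        else:
            lo = mid + 1
    return lo


class SortedAwareDataset:
    """Toy stand-in for a partitioned collection of (key, value) pairs."""

    def __init__(self, partitions, is_sorted=False):
        self.partitions = [list(p) for p in partitions]
        self.is_sorted = is_sorted  # hypothetical "sortedness" flag

    def sort_by_key(self):
        # Sort globally, then cut into contiguous chunks (range-partition style).
        pairs = sorted((kv for p in self.partitions for kv in p),
                       key=lambda kv: kv[0])
        n = max(1, len(self.partitions))
        size = max(1, -(-len(pairs) // n))  # ceiling division
        chunks = [pairs[i:i + size] for i in range(0, len(pairs), size)]
        return SortedAwareDataset(chunks, is_sorted=True)

    def map_values(self, f):
        # Keys and order are untouched, so sortedness carries over.
        parts = [[(k, f(v)) for k, v in p] for p in self.partitions]
        return SortedAwareDataset(parts, is_sorted=self.is_sorted)

    def filter(self, pred):
        # Dropping elements cannot reorder the rest, so sortedness carries over.
        parts = [[kv for kv in p if pred(kv)] for p in self.partitions]
        return SortedAwareDataset(parts, is_sorted=self.is_sorted)

    def map(self, f):
        # f may change keys arbitrarily, so the flag is conservatively dropped.
        parts = [[f(kv) for kv in p] for p in self.partitions]
        return SortedAwareDataset(parts, is_sorted=False)

    def filter_by_range(self, lower, upper):
        """Keep pairs with lower <= key <= upper."""
        if not self.is_sorted:
            # No sortedness guarantee: fall back to a full scan.
            return self.filter(lambda kv: lower <= kv[0] <= upper)
        parts = []
        for p in self.partitions:
            start = _first_index_where(p, lambda k: k >= lower)
            end = _first_index_where(p, lambda k: k > upper)
            parts.append(p[start:end])
        return SortedAwareDataset(parts, is_sorted=True)

    def collect(self):
        return [kv for p in self.partitions for kv in p]
```

    In this sketch, `sort_by_key().map_values(f).filter_by_range(lo, hi)` 
still takes the binary-search path, while inserting a plain `map()` anywhere 
in the chain silently falls back to the scan. Doing the same in Spark would 
mean threading such a flag through every transformation, which is why it 
looks like a separate design question.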


