[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076466#comment-16076466 ]

Eran Moscovici commented on SPARK-21280:

Thanks for your responses.
Unfortunately, our use case needs a structured serialization that preserves the object's inner structure on disk, rather than melting the object into one single byte array. Encoders.bean does preserve the inner structure, but it cannot be used with BloomFilter because BloomFilter is not bean compliant. Encoders.javaSerialization works with BloomFilter but does not preserve the inner structure on disk.

> org.apache.spark.util.sketch.BloomFilter not bean compliant
> ---
>
> Key: SPARK-21280
> URL: https://issues.apache.org/jira/browse/SPARK-21280
> Project: Spark
> Issue Type: Improvement
> Components: Java API
> Affects Versions: 2.1.1
> Reporter: Eran Moscovici
> Priority: Minor
>
> Trying to work with Dataset fails at runtime with the 'not bean
> compliant' exception.
> This means that BloomFilter objects cannot be used as values to be handled
> within a Spark Dataset or saved (for example as a parquet file).
> One would expect an object within the Spark ecosystem
> ('org.apache.spark.util.sketch.BloomFilter') to be able to do that.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076094#comment-16076094 ]

Eran Moscovici commented on SPARK-21280:

I have a table of BloomFilters (each row is a class with several BloomFilter members). Serializing and deserializing the row class with Encoders.javaSerialization succeeds, as you predicted. However, the table is also queried, and each BloomFilter (one per row per column) is expected to be saved in the form of its own byte array, so the fact that a whole row is saved as a single byte array poses a problem.
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074928#comment-16074928 ]

Sean Owen commented on SPARK-21280:

Not sure what you mean here. Serialization can mean serializing an entire object graph, but always means producing a single sequence of bytes as its serialized form. That's what this class already represents. What doesn't work here if you try it this way?
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074912#comment-16074912 ]

Eran Moscovici commented on SPARK-21280:

Unfortunately, Encoders.javaSerialization serializes the object into one single byte array. This poses a problem when the class has multiple members, as in my case. A separate byte array for each member would preserve the structure of the data.
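
The per-member layout described in this comment could look roughly like the following. This is only a sketch under stated assumptions, not something proposed in the thread: `TinyFilter` is a hypothetical stand-in so the example runs without Spark on the classpath, and `FilterRow`, `userFilter`, and `itemFilter` are made-up names. Spark's real `BloomFilter` does expose `writeTo(OutputStream)` and a static `readFrom(InputStream)`, which the stand-in mimics. A row class shaped like `FilterRow` (public no-arg constructor, one `byte[]` getter/setter pair per filter) is the kind of class `Encoders.bean` is meant for, but whether each `byte[]` property ends up as its own binary column should be verified on the Spark version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.BitSet;

public class FilterRowSketch {

    /** Hypothetical stand-in for a Bloom filter; this is NOT Spark's BloomFilter. */
    public static final class TinyFilter {
        private final int numBits;
        private final BitSet bits;

        public TinyFilter(int numBits) {
            this(numBits, new BitSet(numBits));
        }

        private TinyFilter(int numBits, BitSet bits) {
            this.numBits = numBits;
            this.bits = bits;
        }

        public void put(String item) {
            bits.set(index(item, 17));
            bits.set(index(item, 31));
        }

        public boolean mightContain(String item) {
            return bits.get(index(item, 17)) && bits.get(index(item, 31));
        }

        // Two cheap, illustrative hash positions; a real filter uses better hashing.
        private int index(String item, int seed) {
            return Math.floorMod(item.hashCode() * seed + seed, numBits);
        }

        // Same method shape as BloomFilter.writeTo(OutputStream).
        public void writeTo(OutputStream out) throws IOException {
            DataOutputStream data = new DataOutputStream(out);
            byte[] raw = bits.toByteArray();
            data.writeInt(numBits);
            data.writeInt(raw.length);
            data.write(raw);
            data.flush();
        }

        // Same method shape as BloomFilter.readFrom(InputStream).
        public static TinyFilter readFrom(InputStream in) throws IOException {
            DataInputStream data = new DataInputStream(in);
            int numBits = data.readInt();
            byte[] raw = new byte[data.readInt()];
            data.readFully(raw);
            return new TinyFilter(numBits, BitSet.valueOf(raw));
        }
    }

    /** Bean-style row: one byte[] property per filter instead of one blob per row. */
    public static class FilterRow implements Serializable {
        private byte[] userFilter;
        private byte[] itemFilter;

        public FilterRow() {}

        public byte[] getUserFilter() { return userFilter; }
        public void setUserFilter(byte[] userFilter) { this.userFilter = userFilter; }
        public byte[] getItemFilter() { return itemFilter; }
        public void setItemFilter(byte[] itemFilter) { this.itemFilter = itemFilter; }
    }

    public static byte[] toBytes(TinyFilter filter) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        filter.writeTo(out);
        return out.toByteArray();
    }

    public static TinyFilter fromBytes(byte[] bytes) throws IOException {
        return TinyFilter.readFrom(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws IOException {
        TinyFilter users = new TinyFilter(256);
        users.put("alice");
        TinyFilter items = new TinyFilter(256);
        items.put("book-42");

        // Each member keeps its own byte array, so the row structure survives.
        FilterRow row = new FilterRow();
        row.setUserFilter(toBytes(users));
        row.setItemFilter(toBytes(items));

        TinyFilter restored = fromBytes(row.getUserFilter());
        System.out.println(restored.mightContain("alice")); // prints true
    }
}
```

The trade-off is that callers must convert between filter and bytes at the edges, but each filter stays individually addressable in the stored row.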
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072734#comment-16072734 ]

Sean Owen commented on SPARK-21280:

Looking at the class, I don't think it was ever intended to be used by end-user code, right? It's also not really a bean; its primary role is not merely to carry properties. So I don't think it's actually a short hop to make it a bean.

However, you don't need a Java bean to use an Encoder. Is that all you're trying to do? This is a Serializable class, so use Encoders.javaSerialization.
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072686#comment-16072686 ]

Eran Moscovici commented on SPARK-21280:

One could add a public constructor to BloomFilter, add getters and setters for all members, and make the non-abstract class BloomFilterImpl truly serializable by making its members serializable (for example, class ByteArray).
What would be the fastest way to get a new version of BloomFilter and co.?
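
For readers unfamiliar with what "bean compliant" asks of a class, here is a minimal sketch of the usual JavaBeans shape: a concrete class with a public no-arg constructor and getter/setter pairs. It uses only the JDK's `java.beans.Introspector`. The classes `AbstractSketch` and `SketchBean` and the helper `looksLikeBean` are hypothetical names for illustration, and this is not claimed to be the exact check Spark performs before raising its exception.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Modifier;

public class BeanShapeCheck {

    /** Hypothetical abstract type with no getter/setter pairs. */
    public abstract static class AbstractSketch {
        public abstract long bitSize(); // not a getX()/setX() property
    }

    /** Hypothetical bean: public no-arg constructor plus a read/write property. */
    public static class SketchBean {
        private byte[] payload;

        public SketchBean() {}

        public byte[] getPayload() { return payload; }
        public void setPayload(byte[] payload) { this.payload = payload; }
    }

    /** Rough approximation of the JavaBeans conventions; an assumption, not Spark's logic. */
    public static boolean looksLikeBean(Class<?> type) throws IntrospectionException {
        // An abstract class cannot be instantiated through a no-arg constructor.
        if (Modifier.isAbstract(type.getModifiers())) {
            return false;
        }
        try {
            type.getConstructor(); // only finds a PUBLIC no-arg constructor
        } catch (NoSuchMethodException e) {
            return false;
        }
        // Count properties that have both a getter and a setter.
        int readWriteProperties = 0;
        for (PropertyDescriptor property
                : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
            if (property.getReadMethod() != null && property.getWriteMethod() != null) {
                readWriteProperties++;
            }
        }
        return readWriteProperties > 0;
    }

    public static void main(String[] args) throws IntrospectionException {
        System.out.println(looksLikeBean(AbstractSketch.class)); // prints false
        System.out.println(looksLikeBean(SketchBean.class));     // prints true
    }
}
```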
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071752#comment-16071752 ]

Sean Owen commented on SPARK-21280:

There are plenty of classes that are public in the bytecode, and should be instantiable, yet are not for end-user use, so I don't agree with those arguments. Still, it's probably no big deal to add a getter or whatever. What are you proposing?
[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant
[ https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071749#comment-16071749 ]

Eran Moscovici commented on SPARK-21280:

There is a self-evident use case for using the BloomFilter class as a value class: a table of BloomFilters that filters out requests for entries which don't exist in the table holding the actual data. Other use cases can be thought of.
Unfortunately, adding some getters/setters to the BloomFilter class won't help, since, among other things, the BloomFilter class itself is abstract and uses another class for instantiation.