[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-06 Thread Eran Moscovici (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076466#comment-16076466
 ] 

Eran Moscovici commented on SPARK-21280:


Thanks for your responses.
Unfortunately, our use case needs a structured serialization that preserves the 
inner structure of the object on disk, rather than collapsing the object into a 
single byte array.
Encoders.bean does preserve the inner structure, but BloomFilter cannot be used 
with Encoders.bean because it is not bean compliant.
Encoders.javaSerialization works with BloomFilter but does not preserve the 
inner structure on disk.
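
One possible way to keep one binary value per filter, sketched here with the JDK 
only, is a bean-style row class that holds each filter as its own byte array. 
The class FilterRow, its property names, and the use of java.util.BitSet as a 
stand-in for a Bloom filter are all assumptions made for illustration. With the 
real class, the bytes would presumably come from BloomFilter.writeTo(OutputStream) 
and be restored with BloomFilter.readFrom(InputStream).

```java
import java.util.BitSet;

// Hypothetical row class: one byte[] property per filter, so each filter
// can be stored as its own binary value instead of one opaque blob per row.
class FilterRow {
    private byte[] userFilter;
    private byte[] itemFilter;

    public FilterRow() {}  // public no-arg constructor, as beans require

    public byte[] getUserFilter() { return userFilter; }
    public void setUserFilter(byte[] bytes) { this.userFilter = bytes; }
    public byte[] getItemFilter() { return itemFilter; }
    public void setItemFilter(byte[] bytes) { this.itemFilter = bytes; }
}

class FilterRowDemo {
    // Stand-in for a Bloom filter: a plain BitSet with the given bits set.
    static BitSet filterOf(int... bits) {
        BitSet set = new BitSet();
        for (int b : bits) {
            set.set(b);
        }
        return set;
    }

    // Builds a row whose two members are serialized independently.
    static FilterRow buildRow() {
        FilterRow row = new FilterRow();
        row.setUserFilter(filterOf(1, 5, 9).toByteArray());
        row.setItemFilter(filterOf(2, 64).toByteArray());
        return row;
    }

    public static void main(String[] args) {
        FilterRow row = buildRow();
        // Each member is restored from its own byte array.
        BitSet users = BitSet.valueOf(row.getUserFilter());
        BitSet items = BitSet.valueOf(row.getItemFilter());
        System.out.println(users.get(5) + " " + items.get(64));
    }
}
```

The trade-off in this sketch is that the row class no longer carries filter 
objects directly, so callers convert to and from bytes at the edges.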

> org.apache.spark.util.sketch.BloomFilter not bean compliant
> ---
>
> Key: SPARK-21280
> URL: https://issues.apache.org/jira/browse/SPARK-21280
> Project: Spark
> Issue Type: Improvement
> Components: Java API
> Affects Versions: 2.1.1
> Reporter: Eran Moscovici
> Priority: Minor
>
> Trying to work with Dataset fails at runtime with the 'not bean 
> compliant' exception.
> This means that BloomFilter objects cannot be used as values to be handled 
> within a Spark Dataset or saved (for example as a parquet file).
> One would expect an object within the Spark ecosystem 
> ('org.apache.spark.util.sketch.BloomFilter') to be able to do that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-06 Thread Eran Moscovici (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076094#comment-16076094
 ] 

Eran Moscovici commented on SPARK-21280:


I have a table of BloomFilters (each row is a class with several BloomFilter 
members). Serializing and deserializing the row class with 
Encoders.javaSerialization succeeds, as you predicted.
However, the table is also queried, and each BloomFilter (one per row per 
column) is expected to be saved as its own byte array, so storing a whole row 
as a single byte array poses a problem.




[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-05 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074928#comment-16074928
 ] 

Sean Owen commented on SPARK-21280:
---

Not sure what you mean here. Serialization can mean serializing an entire 
object graph, but always means producing a single sequence of bytes as its 
serialized form. That's what this class already represents. What doesn't work 
here if you try it this way?
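
To make the point about a single byte sequence concrete, here is a minimal 
JDK-only sketch of standard Java serialization applied to a whole row. The Row 
class and the use of java.util.BitSet in place of a filter are assumptions for 
illustration, not code from Spark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.BitSet;

class WholeRowSerialization {
    // Hypothetical row with two members; BitSet stands in for a filter.
    static class Row implements Serializable {
        private static final long serialVersionUID = 1L;
        final BitSet first = new BitSet();
        final BitSet second = new BitSet();
    }

    // The whole object graph (row plus both members) becomes one byte sequence.
    static byte[] serialize(Row row) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(row);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the row, including both members, from that one byte sequence.
    static Row deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Row) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Row row = new Row();
        row.first.set(3);
        row.second.set(7);
        byte[] blob = serialize(row);
        Row copy = deserialize(blob);
        System.out.println(copy.first.get(3) && copy.second.get(7));
    }
}
```

The round trip works, but the members are only reachable after deserializing 
the whole blob.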




[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-05 Thread Eran Moscovici (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074912#comment-16074912
 ] 

Eran Moscovici commented on SPARK-21280:


Unfortunately, Encoders.javaSerialization serializes the object into a single 
byte array. This poses a problem when the class has multiple members, as in my 
case (one byte array per member would preserve the structure of the data).




[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-03 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072734#comment-16072734
 ] 

Sean Owen commented on SPARK-21280:
---

Looking at the class, I don't think it was ever intended to be used by end-user 
code, right? It's also not really a bean; its primary role is not merely to 
carry properties. So I don't think it's actually a short hop to make it a bean.

However, you don't need a Java bean to use an Encoder. Is that all you're 
trying to do? This is a Serializable class, so use Encoders.javaSerialization.




[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-03 Thread Eran Moscovici (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072686#comment-16072686
 ] 

Eran Moscovici commented on SPARK-21280:


One could add a public constructor to BloomFilter, add getters and setters for 
all members, and make the non-abstract class BloomFilterImpl truly serializable 
by making its members serializable (for example, class ByteArray).
What would be the fastest way to get a new version of BloomFilter and its 
related classes?
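
As a rough illustration of what the bean requirements amount to in plain Java, 
the sketch below uses the JDK's java.beans.Introspector and reflection to 
compare two classes. BeanLike and NotBeanLike are hypothetical stand-ins, not 
the real BloomFilter or BloomFilterImpl, and this check is not claimed to be 
the one Spark performs.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Modifier;

class BeanCheck {
    // Hypothetical bean: public no-arg constructor plus a getter/setter pair.
    public static class BeanLike {
        private long numBits;
        public BeanLike() {}
        public long getNumBits() { return numBits; }
        public void setNumBits(long numBits) { this.numBits = numBits; }
    }

    // Hypothetical non-bean: abstract, with state but no accessors.
    public abstract static class NotBeanLike {
        protected long numBits;
        public abstract boolean mightContain(Object item);
    }

    // Counts properties that expose both a getter and a setter.
    static int readWriteProperties(Class<?> type) {
        try {
            BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            int count = 0;
            for (PropertyDescriptor p : info.getPropertyDescriptors()) {
                if (p.getReadMethod() != null && p.getWriteMethod() != null) {
                    count++;
                }
            }
            return count;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when the class is concrete and has a public no-arg constructor.
    static boolean isInstantiable(Class<?> type) {
        if (Modifier.isAbstract(type.getModifiers())) {
            return false;
        }
        try {
            type.getConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(readWriteProperties(BeanLike.class)
            + " " + isInstantiable(BeanLike.class));
        System.out.println(readWriteProperties(NotBeanLike.class)
            + " " + isInstantiable(NotBeanLike.class));
    }
}
```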




[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-02 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071752#comment-16071752
 ] 

Sean Owen commented on SPARK-21280:
---

There are plenty of classes that are public in the bytecode, and should be 
instantiable, yet are not meant for end-user use, so I don't agree with those 
arguments. Still, it's probably no big deal to add a getter or whatever. What 
are you proposing?




[jira] [Commented] (SPARK-21280) org.apache.spark.util.sketch.BloomFilter not bean compliant

2017-07-02 Thread Eran Moscovici (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071749#comment-16071749
 ] 

Eran Moscovici commented on SPARK-21280:


There is a self-evident use case for using the BloomFilter class as a value 
class: a table of BloomFilters that filters out requests for entries which 
don't exist in the table holding the actual data. 
Other use cases can be thought of.
Unfortunately, adding some getters/setters to the BloomFilter class won't help, 
since the BloomFilter class itself is abstract and relies on another class for 
instantiation, among other obstacles.
