[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788906#comment-17788906
 ] 

Marc Le Bihan edited comment on SPARK-45311 at 11/23/23 6:07 AM:
-----------------------------------------------------------------

I'll give 3.5.0 a try. I believe that version 3.5.0 forced me to change from 
_RowEncoder.apply(schema)_ (3.3.x and 3.4.x) to 
_RowEncoder.encoderFor(schema)_, because _RowEncoder.apply_ doesn't exist 
anymore.


I didn't find an {{Encoders.row(...)}} method. A {{new Encoder(schema)}} was 
possible, but I found it difficult to extract a {{ClassInfo<Row> getCls()}}.

I found and switched to the code below, and it seems to match the workaround 
you wrote about for 3.5.x (I'll keep checking later):

 
{code:java}
ExpressionEncoder<Row> encoder = ExpressionEncoder.apply(schema);

cible = cible.mapPartitions((MapPartitionsFunction<Row, Row>) it -> {
   List<Row> rows = new LinkedList<>();

   while (it.hasNext()) {
      [...]
      rows.add(RowFactory.create(valeurs.toArray()));
   }

   return rows.iterator();
}, encoder); {code}
 

 

 

 
----
I tried the latest 3.4.2-SNAPSHOT version available (I refreshed my fork 19h 
ago) to check for the problems related to java.util.NoSuchElementException: 
None.get and the generic types, and it improves execution greatly.

Of the 22 (or so) tests that failed with 3.4.0 or 3.4.1, only 4 still fail 
with 3.4.2-SNAPSHOT, but the failures look different from before:

 
{code:java}
-------------------------------------------------------------------------------
Test set: fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT
-------------------------------------------------------------------------------
Tests run: 6, Failures: 1, Errors: 3, Skipped: 0, Time elapsed: 8.709 s <<< FAILURE! - in fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT
catalogueJeuxDeDonneesEtRessources Time elapsed: 1.472 s <<< ERROR!
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and [Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueJeuxDeDonneesEtRessources(CatalogueDatagouvIT.java:161)

catalogueDatasetsObjetsMetiersPagines Time elapsed: 1.03 s <<< ERROR!
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and [Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.lambda$catalogueDatasetsObjetsMetiersPagines$0(CatalogueDatagouvIT.java:105)
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueDatasetsObjetsMetiersPagines(CatalogueDatagouvIT.java:105)

catalogueJeuxDeDonnees Time elapsed: 0.043 s <<< ERROR!
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and [Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueJeuxDeDonnees(CatalogueDatagouvIT.java:143)

catalogueDatasetsObjetsMetiersPaginesStreames Time elapsed: 0.534 s <<< FAILURE!
org.opentest4j.AssertionFailedError: Unexpected exception thrown: java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and [Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueDatasetsObjetsMetiersPaginesStreames(CatalogueDatagouvIT.java:126)
Caused by: java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and [Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.lambda$catalogueDatasetsObjetsMetiersPaginesStreames$3(CatalogueDatagouvIT.java:127)
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.lambda$catalogueDatasetsObjetsMetiersPaginesStreames$4(CatalogueDatagouvIT.java:127)
    at fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueDatasetsObjetsMetiersPaginesStreames(CatalogueDatagouvIT.java:126)
{code}
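
For readers unfamiliar with the message in these traces: it is what the JVM reports when an array whose runtime class is Object[] is cast to TypeVariable[], even if every element is a TypeVariable. The sketch below only reproduces that message with plain Java reflection. Where such a cast happens inside Spark is not shown in the traces, and the class name ArrayCastDemo is hypothetical.

```java
import java.lang.reflect.TypeVariable;
import java.util.List;

// Hypothetical demo class; it is not part of Spark or of the project under test.
final class ArrayCastDemo {
    private ArrayCastDemo() {}

    /** Returns the message of the ClassCastException raised by the bad array cast. */
    static String castObjectArray() {
        // List<E> declares one type parameter, so this array holds one TypeVariable.
        TypeVariable<?>[] typed = List.class.getTypeParameters();

        // Copy the same elements into an array whose runtime class is Object[].
        Object[] untyped = new Object[typed.length];
        System.arraycopy(typed, 0, untyped, 0, typed.length);

        try {
            // Compiles, but fails at runtime: the array class is Object[], not TypeVariable[].
            TypeVariable<?>[] broken = (TypeVariable<?>[]) untyped;
            return "no exception, length " + broken.length;
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(castObjectArray());
    }
}
```

The element types are irrelevant to the check; only the runtime class of the array itself matters, which is why the message names the two array classes ([Ljava.lang.Object; and [Ljava.lang.reflect.TypeVariable;).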


was (Author: mlebihan):
I'll give 3.5.0 a try. I believe that version 3.5.0 forced me to change from 
_RowEncoder.apply(schema)_ (3.3.x and 3.4.x) to 
_RowEncoder.encoderFor(schema)_, because _RowEncoder.apply_ doesn't exist 
anymore.

Did _RowEncoder.apply(schema)_ have the same behavior as 
_Encoders.row(schema)_, by the way?

I'll check a second time and make a few attempts. Thanks a lot.

 

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-45311
>                 URL: https://issues.apache.org/jira/browse/SPARK-45311
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.4.0, 3.4.1, 3.5.0
>         Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>            Reporter: Marc Le Bihan
>            Priority: Major
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It works well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 
> 3.4.x*, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking around on the Internet, I found I wasn't alone in facing this 
> problem. Reading the question below, you'll see that I attempted to debug 
> it, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions should forward a 
> parameter as soon as the name of the field being handled is known, and 
> throughout the rest of their processing, so that whenever the encoder has to 
> throw an exception, it knows which field it is handling in that specific 
> call and can send a message like:
> _java.util.NoSuchElementException: None.get when encoding [the method or 
> field it was targeting]_
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or: Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to *Spark 3.5.0*, the same problems remain, but another 
> one is added to the list:
> "*Only expression encoders are supported for now*" on code that was accepted 
> and working before.
>  
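
Regarding problem 2) in the description quoted above: a message that names a type such as RS is consistent with a bean getter whose declared return type is a type variable. The sketch below uses only plain Java reflection to show that such a getter reports the type variable rather than the concrete class, and one way to resolve it against a direct subclass. It says nothing about how Spark's bean encoder actually walks types; AbstractRecord, StringRecord and GenericGetterDemo are hypothetical names, with RS borrowed from the report.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

// Hypothetical bean hierarchy; RS mirrors the type-variable name from the report.
abstract class AbstractRecord<RS> {
    private RS rs;
    public RS getRs() { return rs; }
    public void setRs(RS rs) { this.rs = rs; }
}

class StringRecord extends AbstractRecord<String> { }

final class GenericGetterDemo {
    private GenericGetterDemo() {}

    private static Method getter() {
        try {
            return StringRecord.class.getMethod("getRs");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Name of the getter's declared return type, as plain reflection reports it. */
    static String declaredReturnType() {
        Type t = getter().getGenericReturnType();
        return (t instanceof TypeVariable) ? ((TypeVariable<?>) t).getName() : t.getTypeName();
    }

    /** Resolves the type variable against the direct superclass only (one level). */
    static Type resolvedReturnType() {
        Type returnType = getter().getGenericReturnType();
        if (!(returnType instanceof TypeVariable)) {
            return returnType;
        }
        ParameterizedType parent =
            (ParameterizedType) StringRecord.class.getGenericSuperclass();
        TypeVariable<?>[] params = AbstractRecord.class.getTypeParameters();
        for (int i = 0; i < params.length; i++) {
            if (params[i].equals(returnType)) {
                return parent.getActualTypeArguments()[i];
            }
        }
        return returnType; // unresolved: stays a type variable
    }

    public static void main(String[] args) {
        System.out.println(declaredReturnType() + " resolves to " + resolvedReturnType().getTypeName());
    }
}
```

Here the declared type is the variable RS, and only the lookup through getGenericSuperclass() recovers String. A deeper hierarchy would need the same resolution repeated at each level, which this sketch does not attempt.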

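The suggestion in problem 1) of the description quoted above, carrying the field name along so that an exception can mention it, could look roughly like the following in plain Java. This is only a sketch of the idea and not Spark code: FieldContextDemo, encodeField and the field name codeCommune are hypothetical, and an empty java.util.Optional stands in for Scala's None.get.

```java
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper illustrating the idea; not an existing Spark API.
final class FieldContextDemo {
    private FieldContextDemo() {}

    /** Runs one encoding step and, on failure, rethrows with the field name appended. */
    static <T> T encodeField(String fieldName, Supplier<T> step) {
        try {
            return step.get();
        } catch (NoSuchElementException e) {
            NoSuchElementException wrapped = new NoSuchElementException(
                e.getMessage() + " when encoding [" + fieldName + "]");
            wrapped.initCause(e); // keep the original stack trace reachable
            throw wrapped;
        }
    }

    /** Triggers a failure on a made-up field and returns the enriched message. */
    static String demoMessage() {
        try {
            // Optional.empty().get() throws NoSuchElementException, much like None.get in Scala.
            encodeField("codeCommune", () -> Optional.<String>empty().get());
            return "no exception";
        } catch (NoSuchElementException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demoMessage());
    }
}
```

Wrapping at the point where the field name is known keeps the inner code unchanged, at the cost of one extra exception object per failure.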


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
