On Wed, Aug 28, 2024 at 10:17 AM Istvan Toth <st...@cloudera.com.invalid>
wrote:

> On Mon, Aug 26, 2024 at 1:59 PM Rejeb Ben Rejeb <benrejebre...@gmail.com>
> wrote:
>
> > Hi,
> >
> > REMOVE DATASOURCE V1:
> > After removing the V1 code, it is possible to configure the V1 name as a
> > V2 long name.
> > I still have to test it to be sure, but this can be done by moving the
> > class PhoenixDataSource under the package "org.apache.phoenix.spark".
> > That way, it will have no impact on old applications that use the Spark
> > API.
> >
> Would creating a compatibility child of the driver under the old package
> name work?
> I don't like the idea of moving the up-to-date code to a new package.

I did some tests, and just moving PhoenixDataSource is no longer enough; the
behavior seems to have changed since the last time I worked on the connector.
The class now also has to be renamed to DefaultSource for the old name to
resolve.
The best solution would be to make a DefaultSource class that inherits from
PhoenixDataSource, which avoids moving and renaming the existing classes.
For the spark3 connector, a small change is needed so that it accepts
Overwrite mode; it will then behave the same as Append mode. That is fine
with me, since this path is only meant to maintain backward compatibility.
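
To make that concrete, the shim I have in mind is roughly the following
(a minimal sketch only; the package of the V2 PhoenixDataSource is an
assumption here and may need adjusting):

    package org.apache.phoenix.spark

    // Minimal backward-compatibility shim. Spark resolves a format string
    // that is a package name by falling back to a class named DefaultSource
    // inside that package, so this keeps
    // spark.read.format("org.apache.phoenix.spark") working without moving
    // or renaming the V2 classes.
    // The parent's package below is an assumption about where the V2
    // PhoenixDataSource lives.
    class DefaultSource
      extends org.apache.phoenix.spark.sql.connector.PhoenixDataSource

Old jobs that pass the V1 package name as the format would then resolve to
the V2 implementation.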


>
> > When I wrote my first message I forgot that there are also helper methods
> > like "phoenixTableAsDataFrame" and "saveToPhoenix"; for those we have two
> > options:
> >
> >    1. Assume that these methods are no longer maintained, document that
> >    the Spark API should be used instead, and remove them.
> >    2. Keep the methods and change their implementation to point to the V2
> >    datasource (all options of V1 are available with V2).
> >
> > Personally, I prefer option 1, since old Scala or Java applications need
> > code and dependency updates to use a newer version of the connector
> > anyway. Python and R applications will not be impacted, as they use the
> > Spark API.
> >
> While I agree with you from a technical POV, the reality is that there are
> a lot of legacy Spark jobs that I'd prefer not to break.
> Option 2 sounds better to me.
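
To make option 2 concrete, it would look roughly like this for saveToPhoenix
(a sketch only; the "phoenix" short name and the "table" / "zkUrl" option
keys are taken from the V2 connector as I remember them, and the signature is
simplified compared to the real helper):

    package org.apache.phoenix.spark

    import org.apache.spark.sql.{DataFrame, SaveMode}

    // Sketch of option 2: keep the legacy helper class behind the implicit,
    // but delegate to the V2 datasource instead of the removed V1 code path.
    // The option keys and the simplified signature are assumptions and may
    // differ from the current code base.
    class DataFrameFunctions(data: DataFrame) extends Serializable {

      def saveToPhoenix(tableName: String, zkUrl: Option[String] = None): Unit = {
        val writer = data.write
          .format("phoenix")          // V2 short name
          .option("table", tableName)
          .mode(SaveMode.Append)      // Overwrite would behave the same way

        zkUrl
          .map(url => writer.option("zkUrl", url))
          .getOrElse(writer)
          .save()
      }
    }

phoenixTableAsDataFrame could delegate the same way through
spark.read.format("phoenix").
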
>
>
> > BUILD ARTIFACTS WITH DIFFERENT SCALA VERSIONS:
> > Yes, since the connector for Spark 2 was compiled with Scala 2.11, it
> > can't be run with a Spark 2 build compiled with Scala 2.12. The same
> > applies to the Spark 3 connector with Scala 2.12 vs 2.13.
> > I meant this for later releases; IMHO this is currently a limitation, and
> > it would be good to have the connector built with both Scala versions so
> > usage is not restricted to only one version of the Spark build.
> > I've done some quick research, and it seems there is a way to manage this
> > with the scala-maven-plugin through multiple executions instead of using
> > Maven profiles.
> >
> That sounds fine, please open a ticket and a PR with your preferred
> solution.
>
OK I'll do it.

>
>
> > Rejeb
> >
> >
> > On Mon, Aug 26, 2024 at 8:49 AM Istvan Toth <st...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > Forgive my ignorance of Spark:
> > >
> > > REMOVE DATASOURCE V1:
> > >
> > > IIRC the V1 and V2 datasources have different names.
> > > Wouldn't this break old applications that use the V1 name?
> > >
> > > BUILD ARTIFACTS WITH DIFFERENT SCALA VERSIONS:
> > >
> > > Is this required because Scala 2.x runtimes are not backwards
> > > compatible?
> > > I don't see a problem with that.
> > >
> > > Its utility is limited until we start providing actual releases and
> > > publishing binary artifacts, but in theory I agree.
> > >
> > > The implementation would be a bit tricky; the solution that comes to
> > > mind is generating the artifacts in multiple Maven runs with different
> > > profiles, like we do for the different HBase profiles now.
> > >
> > > Istvan
> > >
> > > On Fri, Aug 23, 2024 at 7:36 PM Rejeb Ben Rejeb <benrejebre...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I would like to start a discussion about two changes to the
> > > > phoenix5-spark and phoenix5-spark3 connectors.
> > > >
> > > > REMOVE DATASOURCE V1
> > > > It is no longer necessary to keep the Datasource V1 classes, since all
> > > > features are implemented in the new connector version classes.
> > > > When fixing PHOENIX-6783, I checked for impacts and made some
> > > > modifications so that removing the classes is safe and non-breaking.
> > > >
> > > > BUILD ARTIFACTS WITH DIFFERENT SCALA VERSIONS
> > > > The phoenix5-spark2 connector uses spark-2.4.8, which is available for
> > > > Scala 2.11 and Scala 2.12.
> > > > Similarly, phoenix5-spark3 uses spark-3.2.4, which is available for
> > > > Scala 2.12 and Scala 2.13.
> > > >
> > > > It would be nice to have the connectors support both Scala versions,
> > > > like other connectors do, for example MongoDB or Cassandra.
> > > >
> > > > Thanks,
> > > > Rejeb
> > > >
> > >
> >
> >
> > --
> > Regards,
> > Rejeb Ben Rejeb
> >
>
>
> --
> István Tóth | Sr. Staff Software Engineer
> Email: st...@cloudera.com
>


-- 
Regards,
Rejeb Ben Rejeb
