JoshRosen commented on PR #37480: URL: https://github.com/apache/spark/pull/37480#issuecomment-1215496802
> In [SPARK-39489](https://github.com/apache/spark/pull/36885), for `Why are the changes needed?`, I found some of the reasons are as follows:
>
> ```
> In addition, this is a stepping-stone towards eventually being able to remove our Json4s dependency:
>
> Today Spark uses Json4s 3.x and this causes library conflicts for end users who want to upgrade to 4.x; see https://github.com/apache/spark/pull/33630 for one example.
> To completely remove Json4s we'll need to update several other parts of Spark (including code used for ML model serialization); this PR is just a first step towards that goal if we decide to pursue it.
> In this PR, I continue to use Json4s in test code; I think it's fine to keep Json4s as a test-only dependency.
> ```
>
> I'm not sure if @JoshRosen has plans for the next step and the overall blueprint for this. I'm just learning this [SPARK-39489](https://github.com/apache/spark/pull/36885) and trying to start with some simple cases. Similarly, I submitted another pr: #37515

@LuciferYang, my change in https://github.com/apache/spark/pull/36885 was primarily motivated by History Server performance. Although the unblocking of Json4s removal is a nice secondary benefit, I don't think that removal is a super high priority. There are also some burdens in terms of testing to ensure that the old and new JSON is fully cross-compatible (in cases where it needs to be).

Aside from performance-sensitive places, there's limited benefit from _partial_ removal of Json4s: I think the big user-facing benefits would be achieved only when users are free to use any version of Json4s because Spark drops its dependency.

Therefore, I think we should do some more analysis to confirm that it's technically possible to fully remove Json4s (and that doing so is safe / desirable) before we start merging these smaller piece-by-piece removals.

--
This is an automated message from the Apache Git Service.
