I have one question: isn't it possible to abstract a bit and not depend on a given JSON implementation, as this is still a moving target?
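
To make the question concrete, here is a minimal sketch of the kind of abstraction I have in mind. Every name in it (JsonValue, JsonBackend, ToyBackend, StatusPage) is hypothetical and not taken from Spark, lift-json, json4s, or Jackson, and the backend is a dependency-free stand-in so the sketch runs on its own. A real backend would delegate to whichever library is chosen.

```scala
// Hypothetical names throughout: none of these types exist in Spark,
// lift-json, json4s, or Jackson. They only illustrate the idea.

// A tiny JSON model owned by the application rather than by a library.
// Numbers are limited to integers to keep the sketch short.
sealed trait JsonValue
case class JsonString(value: String) extends JsonValue
case class JsonNumber(value: Long) extends JsonValue
case class JsonObject(fields: Seq[(String, JsonValue)]) extends JsonValue

// The only JSON interface the rest of the code base would see.
trait JsonBackend {
  def render(value: JsonValue): String
}

// Stand-in backend so the sketch has no dependencies. Escaping covers
// only backslashes and double quotes, which is not complete JSON escaping.
object ToyBackend extends JsonBackend {
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  def render(value: JsonValue): String = value match {
    case JsonString(s) => quote(s)
    case JsonNumber(n) => n.toString
    case JsonObject(fields) =>
      fields
        .map { case (k, v) => quote(k) + ":" + render(v) }
        .mkString("{", ",", "}")
  }
}

// Application code depends on the trait, never on a concrete library.
class StatusPage(json: JsonBackend) {
  def workerInfo(host: String, cores: Int): String =
    json.render(
      JsonObject(Seq("host" -> JsonString(host), "cores" -> JsonNumber(cores))))
}
```

With this shape, `new StatusPage(ToyBackend).workerInfo("node1", 4)` yields `{"host":"node1","cores":4}`, and moving to another library would mean writing one new JsonBackend rather than touching every call site. The cost is owning and maintaining the JsonValue model ourselves.
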

Regards,
Pascal

On 12 Feb 2014 at 20:30, "Paul Brown" <p...@mult.ifario.us> wrote:

> Hi, Aaron --
>
> I can't speak to issues relevant to Spark, but it looks like json4s is
> currently using the Jackson Scala module 2.1.3 and Scala 2.9.2. There have
> been quite a few significant changes to the Scala module and underpinnings
> between the 2.1.x and 2.3.x series, but I can't speak to how that interacts
> with json4s. Many of those changes are conveniences for direct usage of the
> Jackson Scala module in binding case classes transparently, but you
> wouldn't need or benefit from those through the json4s API. (FWIW, we use
> Jackson Scala 2.3.2 in our Spark jobs to bind lines of JSON from text files
> to case classes.)
>
> I'll reach out to json4s and see if I can get them to update to the 2.3.x
> Jackson series and Scala 2.10, but I think it makes sense for Spark to
> just use the released version and then update when a json4s release is
> available.
>
> Best.
> -- Paul
>
> --
> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>
> On Wed, Feb 12, 2014 at 10:38 AM, Aaron Davidson <ilike...@gmail.com> wrote:
>
> > Will, thanks for the clarifications. I think Spark's main use-case is
> > "warm, small inputs" right now, but the change seems reasonable to me
> > nevertheless.
> >
> > Paul, do you know if there are any issues relevant to Spark that we need
> > from 2.3.2? We would also have to wait for json4s to release a new
> > version that depends on 2.3.2, or else pull it in ourselves.
> >
> > On Wed, Feb 12, 2014 at 9:47 AM, Paul Brown <p...@mult.ifario.us> wrote:
> >
> > > And, with my FasterXML hat on, if you ask, you'll find the Jackson
> > > folks will turn around issues quickly. FWIW, there is a full-suite
> > > Jackson 2.3.2 release rolling right up if you wait a couple of days
> > > to pull that in.
> > >
> > > -- Paul
> > >
> > > --
> > > p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> > >
> > > On Wed, Feb 12, 2014 at 8:12 AM, Will Benton <wi...@redhat.com> wrote:
> > >
> > > > ----- Original Message -----
> > > >
> > > > > I am not sure I fully understand this reasoning. I imagine that
> > > > > lift-json is only one of hundreds of packages that would have to
> > > > > be built if you wanted to build all of Spark's transitive
> > > > > dependencies from source.
> > > >
> > > > This is absolutely true. However, many of Spark's dependencies are
> > > > already available in operating system distributions. In fact, in
> > > > the case I am most familiar with (packaging Spark for Fedora), Lift
> > > > is the biggest one left that isn't already available or under
> > > > review.
> > > >
> > > > > Additionally, to make sure I understand the impact -- this is
> > > > > only intended to simplify the process of packaging Spark on a new
> > > > > OS distribution that disallows pulling in binaries?
> > > >
> > > > Yes, this was my main motivation. Since the process of building
> > > > Lift and its transitive dependencies is disproportionately complex
> > > > compared to how much Spark uses lift-json, I thought it would be
> > > > nice to replace it with something that could be built as just a
> > > > JSON library. I would argue that -- all else being equal -- it
> > > > generally makes sense to make software development choices that
> > > > facilitate packaging for distributions like Fedora and Debian.
> > > >
> > > > There are other actual and potential advantages, though; here are a
> > > > few:
> > > >
> > > > 1. Based on some simple timing runs I did, json4s-jackson is faster
> > > >    all around when running warm (i.e. on subsequent timing runs in
> > > >    the same VM or timing runs with enough iterations to last for
> > > >    more than a few seconds), slightly slower when running cold on
> > > >    very small parsing tasks, and significantly (~10x) faster on
> > > >    large parsing tasks whether cold or warm. The knee in the cold
> > > >    lift-json performance curve is somewhere between 2 KB and 50 KB
> > > >    of JSON source text. json4s-jackson is nominally faster cold
> > > >    with a 12 KB file, 40% faster with a 50 KB file, 2.6x faster
> > > >    with a 500 KB file, and 10x faster with files ranging from 4 to
> > > >    20 MB. Given how Spark uses JSON at the moment, the improved
> > > >    large-file parsing performance seems unlikely to be a huge
> > > >    practical advantage for json4s-jackson, but it's worth noting.
> > > > 2. The release schedule of json4s isn't coupled to the release
> > > >    schedule of a larger project.
> > > > 3. json4s is intended to provide a uniform interface to Scala JSON
> > > >    libraries, and it provides multiple backends, which offers
> > > >    potential flexibility in the future. (To be fair, this interface
> > > >    is heavily based on the one provided by Lift, so it would be
> > > >    only slightly more work to go from lift-json to json4s, as my
> > > >    patch does, than it would be to switch between json4s backends.)
> > > >
> > > > Again, this change is primarily motivated by a desire to make life
> > > > easier for downstream packagers, but there is no obvious downside
> > > > (beyond the downsides inherent in changing library dependencies)
> > > > and several minor advantages.
> > > >
> > > > best,
> > > > wb