On Fri, Oct 24, 2014 at 10:48 AM, Luke Kanies <l...@puppetlabs.com> wrote:
> On Oct 24, 2014, at 9:59 AM, Andy Parker <a...@puppetlabs.com> wrote:
>
> On Fri, Oct 24, 2014 at 2:47 AM, Erik Dalén <erik.gustav.da...@gmail.com> wrote:
>
>> On 24 October 2014 03:24, Henrik Lindberg <henrik.lindb...@cloudsmith.com> wrote:
>>
>>> On 2014-24-10 2:04, Andy Parker wrote:
>>>
>>>> A while ago we removed support for puppet to *send* YAML on the network.
>>>> At the same time we converted to using safe_yaml for receiving YAML in
>>>> order to keep compatibility with existing agents. Instead of YAML all of
>>>> the communication was done with PSON, which is a variant of JSON that
>>>> has been in use in puppet since at least 2010. As far as I understand
>>>> PSON started out as simply a vendored version of json_pure. The name
>>>> PSON was apparently because rails would try to patch anything named
>>>> JSON, and so they needed to name it something different to stop that
>>>> from happening (that is all hearsay, so I don't know how truthful it
>>>> is).
>>>>
>>>> Over time PSON started to evolve. Little changes were made to it here
>>>> and there. The largest change came about because of
>>>> http://projects.puppetlabs.com/issues/5261. The changes for that ticket
>>>> removed the restriction that only valid UTF-8 could be sent in PSON,
>>>> which opened the door to a) binary data as file contents and b)
>>>> absolutely no control over what encodings puppet was using. Over time
>>>> there have been a large number of issues that have been related to not
>>>> keeping track of what encoding puppet is dealing with.
>>>>
>>>> I'd like to move us away from PSON and onto a standard format. YAML is
>>>> out of the question because it is either slow and unsafe (all of the
>>>> YAML vulnerabilities) or extremely slow and safe (safe_yaml).
>>>> MessagePack might be nice. It is pretty well specified, has a fairly
>>>> large number of libraries written for it, but it doesn't do much to help
>>>> us solve the wild west of encoding in puppet. In MessagePack there
>>>> aren't really any enforcements of string encodings and everything is
>>>> treated as an array of bytes.
>>>>
>>>> In order to keep consistency across various puppet projects we'll be
>>>> going with JSON. JSON requires that everything is valid UTF-8, which
>>>> gives us a nice deliberateness to handling data. JSON is pretty fast
>>>> (not as fast as MessagePack) and there are a lot of libraries if it
>>>> turns out that the built in json isn't fast enough (puppet-server could
>>>> use jrjackson, for instance).
>>>>
>>>> So what all would be changing?
>>>>
>>>> 1. Network communication that is using PSON would move to JSON
>>>> 2. YAML files that the master and agent write would move to JSON
>>>> (node, facts, last_run_summary, state, etc.).
>>>> 3. A new exec node terminus would be written to handle JSON, or the
>>>> existing one would be updated (check the first byte for '{').
>>>>
>>>> That is just some of the changes that will need to happen. There will be
>>>> a ripple of other changes based on the fact that JSON has to be UTF-8.
>>>>
>>>> 1. A new "encoding" parameter on File and a base64() function.. This
>>>> will allow transferring non-UTF-8 data as file content until we can get
>>>> a new catalog structure that allows tracking data types and more changes
>>>> to the language to differentiate Strings from Blobs.
>>>>
>>>
>>> I would like us to add a Binary datatype upfront instead of doing the
>>> base64 encoding in the puppet code. Instead, it is the serialization
>>> formats responsibility to transform it into a form that can be transported.
>>> A JSON in text form can then do the base64 encoding. A MsgPack / JSON can
>>> instead use the binary directly.
>>>
>>> Even if our first cut of this always performs a base64 encoding the user
>>> logic does not have to change.
>>>
>>> Thus, instead of calling base64(content) and setting the encoding in the
>>> File resource, a Binary is created directly with a binary(encoding,
>>> content) function.
>>>
>>
>> How do you differentiate between an encoded binary string and a regular
>> string in the JSON though?
>> You would need some sort of annotation, and if that is inside the string
>> (which it is in the content parameter of files already btw) you might need
>> a way to escape it to be able to have a regular string that contains that
>> annotation stuff.
>>
>
> I talked to Henrik about this and his idea is that we make file content a
> special case. We write a binary() function that takes a String and produces
> a hash of { "encoding" => ..., "data" => ... } (or something like that) in
> the serialized form. Then the file content is written to allow either a
> string or a hash of that structure. We could even implement this as a type
> in the puppet language and update the serializer to do that. Perhaps we
> should also create a new binary_file() function so that non-UTF-8 values
> don't leak in via file().
>
>
> Can’t we switch file serving to just do raw downloads? Why do they even
> need encoding at all?
>
>

File serving is already done that way. We switched file buckets to that system a few releases ago as well, IIRC. The problem isn't the file server or the file bucket, but file resources in manifests that have a "content" parameter with non-UTF-8 data.

> Especially if we focus on getting the static catalog to work, all file
> serving turns into a plain HTTP get, and it should skip all of the Puppet
> transfer, encoding, etc.
>
>

The static compiler deals with the source parameter, not the content parameter (although it could I suppose). The current implementation also has the problem that it takes over the content parameter for another meaning, which has caught out several people (try to save a file that has content => "{md5}abdefabcdef").
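
To make the "string or hash" idea for file content a bit more concrete, here is a rough Ruby sketch. It is only an illustration under assumptions, not Puppet code: the helper names wrap_binary and unwrap_content are hypothetical, the "base64" value for "encoding" is just one possible choice, and the resource hash is a simplified stand-in for a real catalog entry.

```ruby
require 'json'

# Hypothetical helper: wrap raw bytes in a JSON-safe hash of the
# { "encoding" => ..., "data" => ... } shape suggested above.
# pack('m0') produces base64 without line breaks, so the result is plain ASCII.
def wrap_binary(bytes)
  { 'encoding' => 'base64', 'data' => [bytes].pack('m0') }
end

# Hypothetical helper: file content may be a plain String (returned as is)
# or the wrapper hash (decoded back to raw bytes).
def unwrap_content(content)
  case content
  when String
    content
  when Hash
    unless content['encoding'] == 'base64'
      raise ArgumentError, "unsupported encoding: #{content['encoding'].inspect}"
    end
    content['data'].unpack1('m0')
  else
    raise ArgumentError, 'content must be a String or a Hash'
  end
end

# Bytes that are not valid UTF-8 and so cannot travel as a JSON string.
raw = "\xFF\xFE\x00binary".b

# Simplified, assumed stand-in for a File resource in a catalog.
resource = {
  'type'       => 'File',
  'title'      => '/tmp/blob',
  'parameters' => { 'content' => wrap_binary(raw) }
}

wire    = JSON.generate(resource)  # UTF-8 text, safe to send as JSON
decoded = unwrap_content(JSON.parse(wire)['parameters']['content'])

puts decoded == raw  # the original bytes survive the round trip
```

Because the marker is the hash structure rather than a prefix inside the string, a plain string such as "{md5}abdefabcdef" would pass through unwrap_content untouched in this sketch, which is the kind of ambiguity the annotation question above is worried about.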
--
Andrew Parker
a...@puppetlabs.com
Freenode: zaphod42
Twitter: @aparker42
Software Developer

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/CANhgQXt9LpaSZXas5q6kWTzUgmAS4Qpdv-sWbqHvDx6TvADRBA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.