On Mon, 2010-03-29 at 11:54 -0700, Luke Kanies wrote:
> On Mar 29, 2010, at 11:31 AM, Brice Figureau wrote:
> > I started working again on #3373:
> > http://projects.reductivelabs.com/issues/3373
> >
> > I'd like to discuss here how to achieve the file client streaming (ie
> > never fully buffer the file content).
> >
> > What makes thing real complex is that the response is valid only in
> > the
> > block of the network request:
> >
> > <pseudo-code>
> >
> > http.get(...) do |response|
> > ... here response is accessible by chunk
> > end
> >
> > ... response is not accessible unless the request was buffered in RAM
> > ... which is not what we want to achieve.
> > </pseudo-code>
> >
> > In the current version, the file content indirector REST call is done
> > when something tries to access the File source property content
> > attribute. This is called by the content property at the time of
> > syncing
> > the type.
> >
> > So, we either need to reverse how the content writing works (passing
> > done a block), or use dirty thread tricks like I did to defer the http
> > request to when we'll actually write content to disk, because we
> > have to
> > pass a block (which will be called with the response chunk) that will
> > have to write the content to disk.
> >
> > My current implementation solves the issue by firing a new thread that
> > waits the file type is ready to write something to the disk and then
> > handles response chunk to the writer thread. It works, but is quite
> > dirty (and I'm sure everybody was offensed by this hardcore code to
> > the
> > point everybody was speechless :-)). For reference see
> > DeferredResponse
> > in [1]
> >
> > So, I'm seeking advice on how to write this code better, and will
> > welcome any comments or ideas.
> >
> > If you need a reference to the actual patch see here:
> > [1] http://github.com/masterzen/puppet/tree/tickets/master/3373
>
> Given that file management is already far more complicated than all of
> the other Indirector types, I'd prefer to have the 'write' method in
> File directly do the http call.
That makes sense.
> You can make the http instance easily with the
> Network::HttpPool#http_instance method, then you just have to parse
> the URL from the source.
I know,
> We could pretty easily just have a simple branch here. Here's the
> current writing line:
>
> File.open(path, File::CREAT|File::WRONLY|File::TRUNC, mode) { |f|
> f.print content }
>
> With 'content' passed in. Instead, I'd recommend changing it to
> something like:
>
> File.open(path, File::CREAT|File::WRONLY|File::TRUNC, mode) { |f|
> write_content(f) }
I'd also like to compute the on-disk checksum if needed, instead of
having to reparse the whole file to do it (that's what the current patch
does). I don't thing it saves a lot, but we're already scanning the
whole file content, it is easy to add some calls to a stream checksum.
> And then have 'write_content' switch on the content type:
>
> def write_content(fh)
> if parameter(:content).actual_content
> fh.print content
> else
> uri = URI.parse(self[:source])
> case uri.scheme
> when "puppet"; write_content_from_http(fh, uri)
> when "file"; write_content_from_file(fh, uri.path)
> else
> raise "some failure"
> end
> end
> end
>
> Just about this exact case statement is already used somewhere else
> and could likely be removed from there. At worst, you've got the
> basic structure and parsing duplicated, but my guess is we can do that
> in a way that eliminates duplication. Then we just implement the
> basic bits:
>
> def write_content_from_http(fh, uri)
> http = Puppet::Network::HttpPool.http_instance(uri.host, uri.port
> || Puppet[:masterport])
> http.get { |r| .... }
> end
>
> def write_content_from_file(fh, path)
> ...similar by-chunk copying of file....
> end
>
> Given that this doesn't require any arechitectural chagnes, and
> probably actually enables some simplification of it, why not use this?
Indeed, this is definitely better, but I didn't need a so detailed
example up to the point it's like you wrote the code :-D
> I agree it results in duplication, but I think that can mostly be
> mitigated and where it can't I think it just makes sense to pay that
> cost rather than the cost of complicating the Indirector architecture.
You're right.
I'll try to come with a new patch in a couple of weeks (I'll be AFK for
a few days next week-end).
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
--
You received this message because you are subscribed to the Google Groups
"Puppet Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/puppet-dev?hl=en.