On Mon, 2010-03-29 at 11:54 -0700, Luke Kanies wrote:
> On Mar 29, 2010, at 11:31 AM, Brice Figureau wrote:
> > I started working again on #3373:
> > http://projects.reductivelabs.com/issues/3373
> >
> > I'd like to discuss here how to achieve the file client streaming (ie
> > never fully buffer the file content).
> >
> > What makes thing real complex is that the response is valid only in  
> > the
> > block of the network request:
> >
> > <pseudo-code>
> >
> > http.get(...) do |response|
> >  ... here response is accessible by chunk
> > end
> >
> > ... response is not accessible unless the request was buffered in RAM
> > ... which is not what we want to achieve.
> > </pseudo-code>
> >
> > In the current version, the file content indirector REST call is done
> > when something tries to access the File source property content
> > attribute. This is called by the content property at the time of  
> > syncing
> > the type.
> >
> > So, we either need to reverse how the content writing works (passing
> > done a block), or use dirty thread tricks like I did to defer the http
> > request to when we'll actually write content to disk, because we  
> > have to
> > pass a block (which will be called with the response chunk) that will
> > have to write the content to disk.
> >
> > My current implementation solves the issue by firing a new thread that
> > waits the file type is ready to write something to the disk and then
> > handles response chunk to the writer thread. It works, but is quite
> > dirty (and I'm sure everybody was offensed by this hardcore code to  
> > the
> > point everybody was speechless :-)). For reference see  
> > DeferredResponse
> > in [1]
> >
> > So, I'm seeking advice on how to write this code better, and will
> > welcome any comments or ideas.
> >
> > If you need a reference to the actual patch see here:
> > [1] http://github.com/masterzen/puppet/tree/tickets/master/3373
> 
> Given that file management is already far more complicated than all of  
> the other Indirector types, I'd prefer to have the 'write' method in  
> File directly do the http call.

That makes sense.

> You can make the http instance easily with the  
> Network::HttpPool#http_instance method, then you just have to parse  
> the URL from the source.

I know,

> We could pretty easily just have a simple branch here.  Here's the  
> current writing line:
> 
> File.open(path, File::CREAT|File::WRONLY|File::TRUNC, mode) { |f|  
> f.print content }
> 
> With 'content' passed in.  Instead, I'd recommend changing it to  
> something like:
> 
> File.open(path, File::CREAT|File::WRONLY|File::TRUNC, mode) { |f|  
> write_content(f) }

I'd also like to compute the on-disk checksum if needed, instead of
having to reparse the whole file to do it (that's what the current patch
does). I don't thing it saves a lot, but we're already scanning the
whole file content, it is easy to add some calls to a stream checksum.

> And then have 'write_content' switch on the content type:
> 
> def write_content(fh)
>    if parameter(:content).actual_content
>      fh.print content
>    else
>      uri = URI.parse(self[:source])
>      case uri.scheme
>      when "puppet"; write_content_from_http(fh, uri)
>      when "file"; write_content_from_file(fh, uri.path)
>      else
>        raise "some failure"
>      end
>    end
> end
> 
> Just about this exact case statement is already used somewhere else  
> and could likely be removed from there.  At worst, you've got the  
> basic structure and parsing duplicated, but my guess is we can do that  
> in a way that eliminates duplication.  Then we just implement the  
> basic bits:
> 
> def write_content_from_http(fh, uri)
>    http = Puppet::Network::HttpPool.http_instance(uri.host, uri.port  
> || Puppet[:masterport])
>    http.get { |r| .... }
> end
> 
> def write_content_from_file(fh, path)
>    ...similar by-chunk copying of file....
> end
> 
> Given that this doesn't require any arechitectural chagnes, and  
> probably actually enables some simplification of it, why not use this?

Indeed, this is definitely better, but I didn't need a so detailed
example up to the point it's like you wrote the code :-D

> I agree it results in duplication, but I think that can mostly be  
> mitigated and where it can't I think it just makes sense to pay that  
> cost rather than the cost of complicating the Indirector architecture.

You're right.
I'll try to come with a new patch in a couple of weeks (I'll be AFK for
a few days next week-end).
-- 
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/puppet-dev?hl=en.

Reply via email to