On Mar 30, 2010, at 10:46 AM, Brice Figureau wrote:

On 29/03/10 20:54, Luke Kanies wrote:
On Mar 29, 2010, at 11:31 AM, Brice Figureau wrote:

Hi,

I started working again on #3373:
http://projects.reductivelabs.com/issues/3373

I'd like to discuss here how to achieve the file client streaming (ie
never fully buffer the file content).

What makes thing real complex is that the response is valid only in the
block of the network request:

<pseudo-code>

http.get(...) do |response|
... here response is accessible by chunk
end

... response is not accessible unless the request was buffered in RAM
... which is not what we want to achieve.
</pseudo-code>

In the current version, the file content indirector REST call is done
when something tries to access the File source property content
attribute. This is called by the content property at the time of syncing
the type.

So, we either need to reverse how the content writing works (passing
done a block), or use dirty thread tricks like I did to defer the http request to when we'll actually write content to disk, because we have to pass a block (which will be called with the response chunk) that will
have to write the content to disk.

My current implementation solves the issue by firing a new thread that
waits the file type is ready to write something to the disk and then
handles response chunk to the writer thread. It works, but is quite
dirty (and I'm sure everybody was offensed by this hardcore code to the point everybody was speechless :-)). For reference see DeferredResponse
in [1]

So, I'm seeking advice on how to write this code better, and will
welcome any comments or ideas.

If you need a reference to the actual patch see here:
[1] http://github.com/masterzen/puppet/tree/tickets/master/3373

Given that file management is already far more complicated than all of
the other Indirector types, I'd prefer to have the 'write' method in
File directly do the http call.

You can make the http instance easily with the
Network::HttpPool#http_instance method, then you just have to parse the
URL from the source.

We could pretty easily just have a simple branch here.  Here's the
current writing line:

File.open(path, File::CREAT|File::WRONLY|File::TRUNC, mode) { |f|
f.print content }

With 'content' passed in.  Instead, I'd recommend changing it to
something like:

File.open(path, File::CREAT|File::WRONLY|File::TRUNC, mode) { |f|
write_content(f) }

And then have 'write_content' switch on the content type:

def write_content(fh)
 if parameter(:content).actual_content
   fh.print content
 else
   uri = URI.parse(self[:source])
   case uri.scheme
   when "puppet"; write_content_from_http(fh, uri)
   when "file"; write_content_from_file(fh, uri.path)
   else
     raise "some failure"
   end
 end
end

Just about this exact case statement is already used somewhere else and
could likely be removed from there.  At worst, you've got the basic
structure and parsing duplicated, but my guess is we can do that in a
way that eliminates duplication. Then we just implement the basic bits:

def write_content_from_http(fh, uri)
 http = Puppet::Network::HttpPool.http_instance(uri.host, uri.port ||
Puppet[:masterport])
 http.get { |r| .... }
end

def write_content_from_file(fh, path)
 ...similar by-chunk copying of file....
end

Given that this doesn't require any arechitectural chagnes, and probably
actually enables some simplification of it, why not use this?

The issue here is that Puppet::Type::File#write is passed the content.
This content can come from apparently 3 possible sources:
* content#sync, with a given content
* content#sync, with a given source content (which triggers the file
download)
* ensure#sync, with ""

So, if we remove the content parameter, that means this logic will now
be embedded in File#write, and all the sync property methods will only
call write (with their property name if that's something that matters).
I don't have myself any objections about this change, but maybe you'd
have some?

The content source actually gets dramatically simplified in the testing branch, when I convert the 'checksum' property into a parameter. Basically 'content' completely owns the content, other than, maybe, the 'ensure' property doing that empty write which can easily be worked around.

I actually meant to move the 'write' method into 'content', but I forgot to.

In the end, though, yes, I'm completely comfortable with that change, and I think it's a good idea. And if you happen to move the write method into 'content' in the mean time, I won't object. :)

--
Now and then an innocent man is sent to the legislature.
    --Kin Hubbard
---------------------------------------------------------------------
Luke Kanies  -|-   http://puppetlabs.com   -|-   +1(615)594-8199

--
You received this message because you are subscribed to the Google Groups "Puppet 
Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/puppet-dev?hl=en.

Reply via email to