David,

It is worth mentioning that this has only been an issue since we started to 'pre-compress' the files. These files no longer use mod_deflate. I store them with the suffix .gzb and add the following directive to httpd.conf:

AddEncoding gzip gzb

libCurl then detects the gzip format and decompresses it during transfer.

If you are using the techniques outlined at http://boinc.berkeley.edu/trac/wiki/FileCompression then a partial transfer will work correctly, since mod_deflate will handle the range correctly using the uncompressed file size. I am wondering if I might have been incorrect to open that bug.

A fourth option might be to ignore the request for a range transfer and instead return the whole file with an HTTP 200 code for these file types.

"6.4.2 Range Retrieval Requests

HTTP retrieval requests using conditional or unconditional GET methods MAY request one or more sub-ranges of the entity, instead of the entire entity, using the Range request header, which applies to the entity returned as the result of the request:

Range = "Range" ":" ranges-specifier

A server MAY ignore the Range header. However, HTTP/1.1 origin servers and intermediate caches ought to support byte ranges when possible, since Range supports efficient recovery from partially failed transfers, and supports efficient partial retrieval of large entities.

If the server supports the Range header and the specified range or ranges are appropriate for the entity:

The presence of a Range header in an unconditional GET modifies what is returned if the GET is otherwise successful. In other words, the response carries a status code of 206 (Partial Content) instead of 200 (OK)."

I.e., I should add

RequestHeader unset Range

so that Apache will not process the range request. However, you cannot set conditions on RequestHeader like you can with Header, so this would take effect for all download requests (we don't do this for all of our projects). Any ideas out there?

thanks,
Kevin

From: David Anderson <[email protected]>
To: BOINC Developers Mailing List <[email protected]>
Cc: Carl Christensen <[email protected]>, Kevin Reed/Chicago/i...@ibmus
Date: 06/22/2009 03:14 PM
Subject: [Fwd: [BOINC] #924: Partial Transfer Fails]

It's not clear to me how to solve this problem.

Synopsis:

- Projects can configure their web server to send files in a compressed form to clients that can handle it; see http://boinc.berkeley.edu/trac/wiki/FileCompression
  In modern (>= 5.4) BOINC clients, the client tells libCurl to handle such compression.
- The client keeps track of the number of bytes transferred, N, and if a transfer fails and is retried, it sends a Range: header telling the web server to skip the first N bytes.

NOTE: N is a count of uncompressed bytes; as far as I can tell, libCurl tells us only how many uncompressed bytes it has read. But when we send Range: N to the server, it skips N compressed bytes, which causes the error Kevin points out below.

I can think of 3 solutions, 2 of which I don't know how to implement:

1) If a file is compressed, always start the transfer from the beginning.
   Problem: how does the client know if a file is compressed? This is negotiated between libCurl and the server.

2) Keep track of the number M of compressed bytes transferred, and tell the server to skip M bytes on retries.
   Problems:
   a) libCurl doesn't tell us M, as far as I can tell.
   b) I'm not sure that Apache supports this.

3) Allow a <no_partial_transfer> flag in <file_info> elements, telling the client to always start transfers from the beginning.

Does anyone know how to implement 1) or 2), or have other ideas? If not, should I go ahead and implement 3)?

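The offset mismatch in the NOTE above can be reproduced in a few lines. This is only a minimal Python sketch, not BOINC or libCurl code: the payload, the `serve_range` helper and the "interrupted halfway" offset are all made up for illustration. It assumes a server that applies the Range offset to the stored (compressed) bytes, as with the pre-compressed .gzb files.

```python
import gzip

# Hypothetical payload standing in for a pre-compressed project file.
original = b"BOINC workunit data " * 5000
compressed = gzip.compress(original)  # the bytes the server stores and sends


def serve_range(stored: bytes, start: int):
    """Toy server (an assumption, not Apache): answer 'Range: bytes=start-'
    by counting the offset in stored, i.e. compressed, bytes."""
    if start >= len(stored):
        return 416, b""  # Requested Range Not Satisfiable
    return 206, stored[start:]


# The client only knows N, the number of *uncompressed* bytes it has written.
# Assume the transfer was interrupted halfway through the decoded file.
n_uncompressed = len(original) // 2

status, _ = serve_range(compressed, n_uncompressed)
print("uncompressed size:", len(original))
print("compressed size:  ", len(compressed))
print("Range offset sent:", n_uncompressed)
print("server status:    ", status)
```

With data this repetitive, N is far larger than the compressed file, so the toy server answers 416. For less compressible files N may still fall inside the compressed stream, in which case the server would resume from the wrong position rather than reject the request.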
-- David

-------- Original Message --------
Subject: [BOINC] #924: Partial Transfer Fails
Date: Mon, 22 Jun 2009 15:59:20 -0000
From: BOINC <[email protected]>
Reply-To: [email protected]

#924: Partial Transfer Fails
-----------------------------+----------------------------------------------
 Reporter:  knreed           |       Owner:  davea
     Type:  Defect           |      Status:  new
 Priority:  Major            |   Milestone:  6.10
Component:  Client - Daemon  |     Version:  6.6.31
 Keywords:                   |
-----------------------------+----------------------------------------------
If a project compresses files and relies on libCurl to decompress the file during transfer, then in the event that a transfer is interrupted, BOINC relies on the decompressed file size to set the Request Range header. This will be invalid and therefore fail with:

416 Requested Range Not Satisfiable

This will repeat until the timeout limit for the download.

It would be great if in the event of the failure of a partial file download, the client would then attempt a full download.

--
Ticket URL: <http://boinc.berkeley.edu/trac/ticket/924>
BOINC <http://boinc.berkeley.edu>
Berkeley Open Infrastructure for Network Computing (BOINC)
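
The fallback requested in the ticket (retry with a full download when the range is rejected) could look roughly like the following. This is a hedged Python sketch of the control flow only, not the BOINC client's actual transfer code: `resume_download`, `make_server` and the `fetch(start)` callable are hypothetical names, and `fetch` is assumed to return already-decoded bytes together with the HTTP status. The 200 branch also covers Kevin's option of a server that ignores Range and returns the whole file.

```python
def resume_download(fetch, partial: bytes) -> bytes:
    """Try to resume after `partial`; fall back to a full transfer.

    fetch(start) is a hypothetical callable that issues a GET with
    'Range: bytes=start-' (no Range header when start == 0) and
    returns (http_status, body).
    """
    status, body = fetch(len(partial))
    if status == 206:
        # Server honoured the range: append to what we already have.
        return partial + body
    if status == 416:
        # Offset was not valid for the stored file: discard the partial
        # data and ask for the whole file again.
        status, body = fetch(0)
    if status == 200:
        # Whole file returned (either on the retry, or because the
        # server ignored the Range header): replace the partial data.
        return body
    raise RuntimeError(f"unexpected HTTP status {status}")


def make_server(stored: bytes):
    """Toy stand-in for a web server, for exercising the logic above."""
    def fetch(start: int):
        if start == 0:
            return 200, stored
        if start >= len(stored):
            return 416, b""
        return 206, stored[start:]
    return fetch


stored = b"x" * 100
fetch = make_server(stored)

# A valid offset resumes normally.
print(resume_download(fetch, b"x" * 40) == stored)
# A stale, too-large offset (simulating the uncompressed count) gets 416,
# then the full download succeeds.
print(resume_download(fetch, b"y" * 500) == stored)
```

Restarting from byte 0 costs a full re-download, but it avoids retrying the same invalid range until the download times out.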
_______________________________________________ boinc_dev mailing list [email protected] http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
