Dear Etienne,

Thanks for your questions! Yes, there are ways to control how PushPull
achieves parallelism. Check out:

http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/push_pull_framework.properties

Look at the File Retrieval System related parameters. Also check out this
documentation produced by Brian Foster, which provides a lot of detail on
how to use PushPull:

http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Etienne Koen <[email protected]>
Date: Sunday, September 14, 2014 11:42 PM
To: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>
Cc: Shakeh Khudikyan <[email protected]>
Subject: RE: PushPull

>Thanks for the information!
>
>Please correct me if I am wrong: PushPull in its default operation
>downloads files in parallel? Is there a way to specify any of the
>parallel parameters when downloading files? For example, thread number?
>Is there any way to have more control over the parallelism?
>
>Thanks
>Etienne
>
>Etienne Koen
>Data Processing Systems Engineer
>
>Space Advisory Company
>
>O: +27 (21) 300 0060 I C: +27 (76) 661 0170 I E: [email protected]
>
>________________________________________
>From: Mattmann, Chris A (3980) [[email protected]]
>Sent: Friday, September 12, 2014 4:18 PM
>To: Etienne Koen; [email protected]
>Cc: Khudikyan, Shakeh E (398J)
>Subject: Re: PushPull
>
>Hi Etienne,
>
>Thanks for your question!
>Yes, PushPull has parallel downloading
>capability, so in terms of "pulling" data it definitely has similar
>capability to GridFTP. PushPull can't initiate or "push" a transfer
>like GridFTP can in that sense, so it's not exactly an apples to
>apples comparison.
>
>For the wiki, you can sign up to create an account here:
>
>https://cwiki.apache.org/confluence/signup.action
>
>Cheers!
>
>Chris
>
>-----Original Message-----
>From: Etienne Koen <[email protected]>
>Date: Friday, September 12, 2014 12:15 AM
>To: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>
>Cc: Shakeh Khudikyan <[email protected]>
>Subject: RE: PushPull
>
>>Hi Chris,
>>
>>Thank you for your response and info! I would be happy to document my
>>results and would appreciate it if the community could respond to some of
>>the questions I still have.
>>
>>At the moment it does not look like I have permissions or the
>>functionality to create a page... Or I am looking in the wrong place to
>>do so :-)
>>
>>My immediate question is whether PushPull has parallel capability
>>such as GridFTP and how to specify it for the next test phase...
>>
>>Cheers
>>
>>Etienne Koen
>>
>>________________________________________
>>From: Mattmann, Chris A (3980) [[email protected]]
>>Sent: Thursday, September 11, 2014 4:47 PM
>>To: [email protected]
>>Cc: Etienne Koen; Khudikyan, Shakeh E (398J)
>>Subject: FW: PushPull
>>
>>Etienne,
>>
>>Thank you for sending this along! The crazy part about these types of
>>data transfer studies, especially with TCP/IP based protocols that aren't
>>parallelized (e.g., FTP), is that you are limited by what's going on in
>>the surrounding network. For example, see the attached studies my team
>>has published on data movement over the past 5-7 years and notice a
>>similar type of behavior. Pretty interesting, independent of the family
>>of data transfer you're using.
>>
>>Take a look at my Dissertation too:
>>
>>http://sunset.usc.edu/~mattmann/Dissertation.pdf
>>
>>This concluded that parallel TCP/IP technologies like GridFTP (now
>>GlobusOnline) and bbFTP performed the best across the public WAN for
>>performance and efficiency related parameters. If those aren't the
>>overall properties you are trying to maximize (and you instead care
>>about good enough performance with ease of install and use), then
>>things like WebDAV and so forth are probably good enough.
>>
>>I'd be happy to discuss your results more in general. It would be great
>>if you created a wiki page here:
>>
>>https://cwiki.apache.org/confluence/display/OODT/Home
>>
>>to document your testing and results. Thank you and let me know!
>>
>>Cheers,
>>Chris
>>
>>-----Original Message-----
>>From: Etienne Koen <[email protected]>
>>Date: Thursday, September 11, 2014 12:55 AM
>>To: Chris Mattmann <[email protected]>
>>Cc: Shakeh Khudikyan <[email protected]>
>>Subject: PushPull
>>
>>>Hi Chris and Shakeh,
>>>
>>>Attached are some of the results from tests performed according to the
>>>baseline testing requirements. The test was simply to transfer a 1 GB
>>>directory with varying file sizes. For completeness I have gone so far
>>>as to transfer files of 1 MB each (this scenario might not be very
>>>probable for SKA though...). I have noticed a substantial drop in the
>>>transfer rate achieved compared to the 100 MB files, as well as the
>>>transfer rate being quite variable. What would be the main contributor
>>>to this? I see that there is a metadata file created for each transfer,
>>>which might perhaps contribute to the overhead and become quite
>>>prominent in the 1000 x 1 MB file case. All these tests used the FTP
>>>protocol and were performed on the same machine and network link:
>>>
>>>For testing single file transfer, I found the maximum transfer rate
>>>only being achieved for files > 256 MB:
>>>
>>>I also monitored the transfer rate of an 8192 MB file, which
>>>consistently showed an interesting behaviour: the transfer rate reaches
>>>a maximum and then drops. I am also unsure what the cause of this might
>>>be, as it happened consistently and in both transfer directions:
>>>
>>>I would greatly appreciate your comments on this so that I can include
>>>them in my report before I submit it next week.
>>>
>>>All the best!
>>>
>>>Cheers
>>>Etienne
>>
>>________________________________
>>
>>Disclaimer: This E-mail message, including any attachments, is intended
>>only for the person or entity to which it is addressed, and may contain
>>confidential information. Each page attached hereto must also be read in
>>conjunction with this disclaimer.
>>If you are not the intended recipient you are hereby notified that any
>>disclosure, copying, distribution or reliance upon the contents of this
>>e-mail is strictly prohibited. E.&O.E.
