Re: [Openstack] Savanna/Swift large object copy error
Hello Ross,

On 05.05.15 21:54, Ross Lillie wrote:
> My understanding is that Swift should automagically split files greater
> than 5 GB into multiple segments grouped under a manifest file, but this
> appears not to be working. This was working under the Havana release
> (Ubuntu) using the Swift file system jar downloaded from the Mirantis web
> site. All current testing is based upon the Juno release, performing a
> distcp using the openstack-hadoop jar shipped as part of the latest
> Hadoop distros.

I don't know which client you're using, but Swift itself (on the server side) never splits data into segments on its own, and never did. Currently it is up to the client to ensure that data is broken down into segments of at most 5 GB. There were some ideas in the past to implement this feature server-side, but that raised various problems. Have a look at the history of large objects:

http://docs.openstack.org/developer/swift/overview_large_objects.html#history

I would assume that a change in the client-side implementation changed the behavior for you?

Best Regards,
Christian

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
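[Editor's note: the client-side segmentation Christian describes means the client must upload each piece as a separate object of at most 5 GB, then create a zero-byte manifest object that ties them together (the dynamic large object scheme covered in the linked overview). The following is a minimal sketch of that planning step; the helper names and the `_segments` container naming convention are illustrative, not part of any particular Swift client's API.]

```python
# Sketch of the segmentation a Swift client must do itself before it can
# store an object larger than the per-object limit. Helper names here are
# hypothetical; only the 5 GiB limit and the X-Object-Manifest header come
# from Swift's large-object documentation.

MAX_OBJECT_SIZE = 5 * 1024 ** 3  # Swift's default per-object size limit (5 GiB)

def plan_segments(total_size, segment_size):
    """Return (segment_name, offset, length) tuples covering total_size bytes."""
    if segment_size > MAX_OBJECT_SIZE:
        raise ValueError("segments must not exceed the 5 GiB object limit")
    segments = []
    offset, index = 0, 0
    while offset < total_size:
        length = min(segment_size, total_size - offset)
        # Zero-padded names so the segments sort lexicographically, which is
        # the order a dynamic large object (DLO) manifest reassembles them in.
        segments.append(("%08d" % index, offset, length))
        offset += length
        index += 1
    return segments

def manifest_headers(container, object_name):
    """Headers for the zero-byte manifest object that ties the segments together."""
    return {"X-Object-Manifest": "%s_segments/%s/" % (container, object_name)}

# A 5.5 GB upload with 1 GiB segments becomes six segment PUTs plus one
# zero-byte manifest PUT; no single request exceeds the 5 GiB limit.
plan = plan_segments(int(5.5 * 1024 ** 3), 1024 ** 3)
headers = manifest_headers("backups", "bigfile")
```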
[Openstack] Savanna/Swift large object copy error
We're currently running OpenStack Juno and are experiencing errors when performing large object copies between Hadoop HDFS and our Swift object store. While not using the Savanna service directly, we are relying upon the Swift file system extension for Hadoop created as part of the Savanna project. In each case, the large object copy (using Hadoop's distcp) results in Swift reporting an Error 413 - Request entity too large.

As a test case, I created a 5.5 GB file of random data and tried to upload the file to Swift using the swift CLI. Once again Swift returned Error 413. If, however, I explicitly set a segment size of 1 GB on the swift command line, then the file uploads correctly. When using Hadoop's distcp to move data from HDFS to Swift, the job always exits with Swift reporting Error 413. Explicitly setting fs.swift.service.x.partsize does not appear to make any difference.

My understanding is that Swift should automagically split files greater than 5 GB into multiple segments grouped under a manifest file, but this appears not to be working. This was working under the Havana release (Ubuntu) using the Swift file system jar downloaded from the Mirantis web site. All current testing is based upon the Juno release, performing a distcp using the openstack-hadoop jar shipped as part of the latest Hadoop distros.

Has anyone else seen this behavior?

Thanks,
/ross

--
Ross Lillie
Application Software Architecture Group
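[Editor's note: for anyone comparing configurations, the upload partition size of the hadoop-openstack filesystem is controlled by the partsize property the post names, and per the hadoop-openstack documentation the value is given in KB (the default is roughly 4.6 GB, just under Swift's 5 GiB limit). An illustrative core-site.xml fragment follows; "x" stands for whatever service name is used in the other fs.swift.service.x.* properties, and the value shown is an example, not a recommendation.]

```xml
<!-- Illustrative fragment only; "x" is the service name used by the other
     fs.swift.service.x.* settings. hadoop-openstack takes partsize in KB,
     so 1048576 KB requests ~1 GB upload partitions. -->
<property>
  <name>fs.swift.service.x.partsize</name>
  <value>1048576</value>
</property>
```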
Re: [Openstack] Savanna/Swift large object copy error
As a followup, when performing a distcp from HDFS to Swift, segments ARE being created in the Swift container with a .distcp- prefix. Each temporary file appears to correspond to an attempt of the map/reduce job. Just as the last temporary segment appears in the remote container, the job aborts, all of the .distcp- temporary objects are deleted, and Hadoop proceeds to the next attempt. For example, for the currently running test case, the swift container listing shows the following:

zantac:~ lillie$ swift list --lh backups
 2.4G 2015-05-05 20:13:16  .distcp.tmp.attempt_1430771817173_0010_m_00_0/01
 2.4G 2015-05-05 20:14:45  .distcp.tmp.attempt_1430771817173_0010_m_00_0/02
 2.4G 2015-05-05 20:16:15  .distcp.tmp.attempt_1430771817173_0010_m_00_0/03
 7.2G

Once the entire file is copied, the operation reports Error 413, and all of the above files are deleted. It's as though the Swift file system isn't able to close the file.

/ross

On Tue, May 5, 2015 at 2:54 PM, Ross Lillie ross.lil...@motorolasolutions.com wrote:
> We're currently running OpenStack Juno and are experiencing errors when
> performing large object copies between Hadoop HDFS and our Swift object
> store. While not using the Savanna service directly, we are relying upon
> the Swift file system extension for Hadoop created as part of the Savanna
> project. In each case, the large object copy (using Hadoop's distcp)
> results in Swift reporting an Error 413 - Request entity too large.
>
> As a test case, I created a 5.5 GB file of random data and tried to
> upload the file to Swift using the swift CLI. Once again Swift returned
> Error 413. If, however, I explicitly set a segment size of 1 GB on the
> swift command line, then the file uploads correctly. When using Hadoop's
> distcp to move data from HDFS to Swift, the job always exits with Swift
> reporting Error 413. Explicitly setting fs.swift.service.x.partsize does
> not appear to make any difference.
>
> My understanding is that Swift should automagically split files greater
> than 5 GB into multiple segments grouped under a manifest file, but this
> appears not to be working. This was working under the Havana release
> (Ubuntu) using the Swift file system jar downloaded from the Mirantis web
> site. All current testing is based upon the Juno release, performing a
> distcp using the openstack-hadoop jar shipped as part of the latest
> Hadoop distros.
>
> Has anyone else seen this behavior?
>
> Thanks,
> /ross
>
> --
> Ross Lillie
> Application Software Architecture Group

--
Ross Lillie
Application Software Architecture Group
View my calendar: https://www.google.com/calendar/embed?src=ross.lillie%40motorolasolutions.com&ctz=America/Chicago
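[Editor's note: one possible reading of the symptom above is that each 2.4 GB temporary segment PUT succeeds because it is below Swift's per-object limit, while the final "close" of the file is attempted as a single object of the full 7.2 GB, which the proxy rejects. A small sketch of the size check that yields 413 follows, assuming Swift's default max_file_size constraint; the function name is illustrative, not Swift's actual code.]

```python
# Sketch of the proxy-side size check behind "413 Request Entity Too Large".
# The constant mirrors Swift's default max_file_size constraint (5 GiB + 2
# bytes); put_status is an illustrative stand-in, not a Swift function.

MAX_FILE_SIZE = 5 * 1024 ** 3 + 2  # Swift's default max_file_size

def put_status(content_length):
    """Return the HTTP status a single-object PUT of this size would get."""
    if content_length > MAX_FILE_SIZE:
        return 413  # Request Entity Too Large
    return 201  # Created

# Each 2.4 GB temporary segment is accepted individually, but closing the
# file as one 7.2 GB object would trip the limit.
segment_result = put_status(int(2.4 * 1024 ** 3))
close_result = put_status(int(7.2 * 1024 ** 3))
```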