John,
I am trying to fix issue #2 mentioned below (handling multi-part upload using
TransferManager), but I ran into the following error:
"2013-06-04 23:06:52,626 INFO [amazonaws.http.AmazonHttpClient]
(s3-transfer-manager-worker-1:) Unable to execute HTTP request: peer not
authenticated
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated"
This is the part of the modified code that caused the exception:
InputStream in = !chunked ? new BufferedInputStream(request.getResponseBodyAsStream())
        : new ChunkedInputStream(request.getResponseBodyAsStream());

s_logger.info("Starting download from " + getDownloadUrl() + " to s3 bucket " + s3.getBucketName()
        + " remoteSize=" + remoteSize + " , max size=" + maxTemplateSizeInByte);

Date start = new Date();
// compute s3 key
s3Key = join(asList(installPath, fileName), S3Utils.SEPARATOR);

// multi-part upload using S3 api to handle > 5G input stream
AWSCredentials myCredentials = new BasicAWSCredentials(s3.getAccessKey(), s3.getSecretKey());
TransferManager tm = new TransferManager(myCredentials);

// download using S3 API
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(remoteSize);
PutObjectRequest putObjectRequest = new PutObjectRequest(s3.getBucketName(), s3Key, in, metadata)
        .withStorageClass(StorageClass.ReducedRedundancy);

// register progress listener
putObjectRequest.setProgressListener(new ProgressListener() {
    @Override
    public void progressChanged(ProgressEvent progressEvent) {
        // s_logger.debug(progressEvent.getBytesTransfered() + " bytes transferred " + new Date());
        totalBytes += progressEvent.getBytesTransfered();
        if (progressEvent.getEventCode() == ProgressEvent.COMPLETED_EVENT_CODE) {
            s_logger.info("download completed");
            status = TemplateDownloader.Status.DOWNLOAD_FINISHED;
        } else if (progressEvent.getEventCode() == ProgressEvent.FAILED_EVENT_CODE) {
            status = TemplateDownloader.Status.UNRECOVERABLE_ERROR;
        } else if (progressEvent.getEventCode() == ProgressEvent.CANCELED_EVENT_CODE) {
            status = TemplateDownloader.Status.ABORTED;
        } else {
            status = TemplateDownloader.Status.IN_PROGRESS;
        }
    }
});

// TransferManager processes all transfers asynchronously,
// so this call will return immediately.
Upload upload = tm.upload(putObjectRequest);
upload.waitForCompletion();
Can you point out what I am doing wrong here? The previous code using the
low-level S3 putObject API did not have this issue.
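One thing I wonder about: the TransferManager(myCredentials) constructor builds
its own default AmazonS3Client, so whatever endpoint/SSL settings the old
low-level putObject path used are not applied to it. Would it make sense to pass
the already-configured client into TransferManager instead? A rough, untested
sketch is below; S3ClientFactory.getClient(s3) is just a placeholder for however
we currently build that configured client:

// Untested sketch: reuse the configured S3 client instead of letting
// TransferManager create a fresh default AmazonS3Client from bare credentials.
// S3ClientFactory.getClient(s3) is a placeholder, not an existing helper.
AmazonS3 s3Client = S3ClientFactory.getClient(s3);
TransferManager tm = new TransferManager(s3Client);
Upload upload = tm.upload(putObjectRequest);
upload.waitForCompletion();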
Thanks
-min
On 6/3/13 11:35 AM, "Min Chen" <[email protected]> wrote:
Hi there,
This thread is to address John's review comments on the S3TemplateDownloader
implementation. From the previous thread, there are two major concerns with this
class:
1. We have used the HttpClient library in this class. For this comment, I can
explain why I need HttpClient when downloading an object to S3. Currently our
download logic is as follows:
-- Get the object's total size and an InputStream from an HTTP URL by invoking an
HttpClient library method.
-- Invoke the S3Utils API to download the InputStream to S3 (this part is purely
S3 API), and get the actual object size stored on S3 once the transfer completes.
-- Compare the object's total size with the actual downloaded size and report a
truncation error if they differ.
John's concern is with step 1 above. We can get rid of the HttpClient library for
obtaining the InputStream from a URL, but I don't know how I can easily get the
object size from a URL. In a previous email, John, you mentioned that I could use
the S3 API getObjectMetaData to get the object size, but my understanding is that
that API only applies to objects already in S3. In my flow, I need the size of an
object that is about to be downloaded to S3, not one already in S3. I would like
to hear your suggestions here; one idea is sketched below.
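If it is only the Content-Length we need, would a plain HEAD request through the
JDK's own java.net.HttpURLConnection (no HttpClient dependency) be acceptable? A
rough, untested sketch of what I have in mind; it assumes the server actually
returns a Content-Length header (for a chunked response it would return -1):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Untested sketch: issue a HEAD request with HttpURLConnection to learn the
// remote object's size before streaming it to S3. Returns -1 when the server
// does not send a Content-Length header (e.g. a chunked response).
private static long getRemoteSize(String downloadUrl) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) new URL(downloadUrl).openConnection();
    conn.setRequestMethod("HEAD");
    try {
        String len = conn.getHeaderField("Content-Length");
        return (len == null) ? -1 : Long.parseLong(len);
    } finally {
        conn.disconnect();
    }
}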
2. John pointed out an issue with the current download method implementation in
this class: I used the low-level S3 API PutObjectRequest to put an InputStream to
S3, and that approach cannot handle objects larger than 5 GB. That is true; I
only realized it after reading several pieces of S3 documentation on multipart
upload. Sorry, I am not an S3 expert and didn't know this earlier when I
implemented the method. Fixing it should not take long to code against this AWS
sample (http://docs.aws.amazon.com/AmazonS3/latest/dev/HLTrackProgressMPUJava.html)
using TransferManager; it just needs some testing time. IMHO, this bug should not
become a major issue blocking the object_store branch merge; it needs only a few
days to fix, assuming we have an extension. Even without an extension, I
personally think this can definitely be resolved in master with a simple bug fix.
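As a side note on step 3 (the truncation check): once the TransferManager-based
upload completes, the object is in S3, so John's getObjectMetadata suggestion
does apply at that point. A rough, untested sketch, assuming we keep a reference
(s3Client below) to the AmazonS3 client used for the transfer:

// Untested sketch: after the upload completes, compare the size S3 reports
// for the stored object with the expected remote size.
ObjectMetadata stored = s3Client.getObjectMetadata(s3.getBucketName(), s3Key);
if (stored.getContentLength() != remoteSize) {
    status = TemplateDownloader.Status.UNRECOVERABLE_ERROR;
    s_logger.error("Truncated transfer: expected " + remoteSize
            + " bytes but S3 reports " + stored.getContentLength() + " bytes");
}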
Thanks
-min