We'll need to confirm that this is the actual issue, and then have it
fixed. It would be nice to have some kind of a unitest that reproduces
it.

Yehuda

On Wed, Jan 20, 2016 at 1:34 PM, seapasu...@uchicago.edu
<seapasu...@uchicago.edu> wrote:
> So is there any way to prevent this from happening going forward? I mean
> ideally this should never be possible, right? Even with a complete object
> that is 0 bytes it should be downloaded as 0 bytes and have a different
> md5sum and not report as 7mb?
>
>
>
> On 1/20/16 1:30 PM, Yehuda Sadeh-Weinraub wrote:
>>
>> On Wed, Jan 20, 2016 at 10:43 AM, seapasu...@uchicago.edu
>> <seapasu...@uchicago.edu> wrote:
>>>
>>>
>>> On 1/19/16 4:00 PM, Yehuda Sadeh-Weinraub wrote:
>>>>
>>>> On Fri, Jan 15, 2016 at 5:04 PM, seapasu...@uchicago.edu
>>>> <seapasu...@uchicago.edu> wrote:
>>>>>
>>>>> I have looked all over and I do not see any explicit mention of
>>>>> "NWS_NEXRAD_NXL2DP_PAKC_20150101110000_20150101115959" in the logs nor
>>>>> do
>>>>> I
>>>>> see a timestamp from November 4th although I do see log rotations
>>>>> dating
>>>>> back to october 15th. I don't think it's possible it wasn't logged so I
>>>>> am
>>>>> going through the bucket logs from the 'radosgw-admin log show
>>>>> --object'
>>>>> side and I found the following::
>>>>>
>>>>> 4604932         {
>>>>> 4604933             "bucket": "noaa-nexrad-l2",
>>>>> 4604934             "time": "2015-11-04 21:29:27.346509Z",
>>>>> 4604935             "time_local": "2015-11-04 15:29:27.346509",
>>>>> 4604936             "remote_addr": "",
>>>>> 4604937             "object_owner": "b05f707271774dbd89674a0736c9406e",
>>>>> 4604938             "user": "b05f707271774dbd89674a0736c9406e",
>>>>> 4604939             "operation": "PUT",
>>>>
>>>> I'd expect a multipart upload completion to be done with a POST, not a
>>>> PUT.
>>>
>>> Indeed it seems really weird.
>>>>
>>>>
>>>>> 4604940             "uri":
>>>>>
>>>>>
>>>>> "\/noaa-nexrad-l2\/2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_20150101110000_20150101115959.tar",
>>>>> 4604941             "http_status": "200",
>>>>> 4604942             "error_code": "",
>>>>> 4604943             "bytes_sent": 19,
>>>>> 4604944             "bytes_received": 0,
>>>>> 4604945             "object_size": 0,
>>>>
>>>> Do you see a zero object_size for other multipart uploads?
>>>
>>> I think so. I still don't know how to tell for certain if a radosgw
>>> object
>>> is a multipart object or not. I think all of the objects in
>>> noaa-nexrad-l2
>>> bucket are multipart::
>>>
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out-        {
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "bucket":
>>> "noaa-nexrad-l2",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "time": "2015-10-16
>>> 19:49:30.579738Z",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "time_local":
>>> "2015-10-16 14:49:30.579738",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "remote_addr": "",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "user":
>>> "b05f707271774dbd89674a0736c9406e",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out: "operation": "POST",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "uri":
>>>
>>> "\/noaa-nexrad-l2\/2015\/01\/13\/KGRK\/NWS_NEXRAD_NXL2DP_KGRK_20150113040000_20150113045959.tar",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "http_status":
>>> "200",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "error_code": "",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "bytes_sent": 331,
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "bytes_received":
>>> 152,
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "object_size": 0,
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "total_time": 0,
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "user_agent":
>>> "Boto\/2.38.0 Python\/2.7.7 Linux\/2.6.32-573.7.1.el6.x86_64",
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out- "referrer": ""
>>> ./2015-10-16-14-default.384153.1-noaa-nexrad-l2.out-        }
>>>
>>> The objects above
>>> (NWS_NEXRAD_NXL2DP_KGRK_20150113040000_20150113045959.tar)
>>> pulls down without an issue though. Below is a paste for object
>>> "NWS_NEXRAD_NXL2DP_KVBX_20150225160000_20150225165959.tar" which 404's::
>>> http://pastebin.com/Jtw8z7G4
>>
>> Sadly the log doesn't provide all the input, but I can guess what the
>> operations were:
>>
>>   - POST (init multipart upload)
>>   - PUT (upload part)
>>   - GET (list parts)
>>   - POST (complete multipart) <-- took > 57 seconds to process
>>   - POST (complete multipart)
>>   - HEAD (stat object)
>>
>> For some reason the complete multipart operation took too long, which
>> I think triggered a client retry (either that, or an abort). Then
>> there were two completions racing (or a complete and abort), which
>> might have caused the issue we're seeing for some reason. E.g., two
>> completions might have ended up with the second completion noticing
>> that it's overwriting an existing object (that we just created),
>> sending the 'old' object to be garbage collected, when that object's
>> tail is actually its own tail.
>>
>>
>>> I see two posts one recorded a minute before for this object both with 0
>>> size though. Does this help at all?
>>
>> Yes, very much
>>
>> Thanks,
>> Yehuda
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Reply via email to