>>>I assume that gluster is used to store the intermediate files before the
>>>reduce phase
Nope, gluster is the destination of the distcp command:

  hadoop distcp -m 50 http://nn1:8020/path/to/folder file:///mnt/gluster

This runs the map tasks on the datanodes, all of which have /mnt/gluster
mounted.
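To see where the files actually land, I count them per brick on the storage
nodes with something like the loop below (the brick paths here are just
placeholders, our real mount points differ):

  # count the distcp output files under each brick on this node
  # (/bricks/brick* is a placeholder for the real brick mount points)
  for b in /bricks/brick*; do
      printf '%s %s\n' "$b" "$(find "$b" -type f -name 'part-0-*' | wc -l)"
  done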
>>>This means that this is caused by some peculiarity of the mapreduce.
Yes, but how can a client write 500 files to the gluster mount and have
those files end up on only a subset of the subvolumes? I cannot use gluster
as a backup cluster if I cannot write to it with distcp.

>>>You should look which files are created in each brick and how many
>>>while the process is running.
Files are only created on nodes 185..204, 205..224, or 225..244. Only 20
nodes are used in each test.

On Tue, Apr 19, 2016 at 1:05 PM, Xavier Hernandez <xhernan...@datalab.es> wrote:
> Hi Serkan,
>
> moved to gluster-users since this doesn't belong on the devel list.
>
> On 19/04/16 11:24, Serkan Çoban wrote:
>>
>> I am copying 10,000 files to a gluster volume using mapreduce on the
>> clients. Each map process takes one file at a time and copies it to
>> the gluster volume.
>
> I assume that gluster is used to store the intermediate files before
> the reduce phase.
>
>> My disperse volume consists of 78 subvolumes of 16+4 disks each. So
>> if I copy >78 files in parallel, I expect each file to go to a
>> different subvolume, right?
>
> If you only copy 78 files, most probably you will get some subvolumes
> empty and some others with more than one or two files. It's not an
> exact distribution, it's a statistically balanced distribution: over
> time and with enough files, each brick will contain a number of files
> of the same order of magnitude, but they won't have the *same* number
> of files.
>
>> In my tests with fio I can see every file going to a different
>> subvolume, but when I start the mapreduce process from the clients,
>> only 78/3 = 26 subvolumes are used for writing files.
>
> This means that this is caused by some peculiarity of the mapreduce.
>
>> I can see that clearly from the network traffic. Mapreduce on the
>> client side can run multi-threaded. I tested with 1, 5, and 10
>> threads on each client, but every time only 26 subvolumes were used.
>> How can I debug the issue further?
>
> You should look which files are created in each brick, and how many,
> while the process is running.
>
> Xavi
>
>>
>> On Tue, Apr 19, 2016 at 11:22 AM, Xavier Hernandez
>> <xhernan...@datalab.es> wrote:
>>>
>>> Hi Serkan,
>>>
>>> On 19/04/16 09:18, Serkan Çoban wrote:
>>>>
>>>> Hi, I just reinstalled a fresh 3.7.11 and I am seeing the same
>>>> behavior. 50 clients are copying files named part-0-xxxx to
>>>> gluster using mapreduce, one thread per server, and they are
>>>> using only 20 servers out of 60. On the other hand, fio tests
>>>> use all the servers. Is there anything I can do to solve the
>>>> issue?
>>>
>>> Distribution of files to ec sets is done by dht. In theory, if you
>>> create many files, each ec set will receive the same amount of
>>> files. However, when the number of files is small enough,
>>> statistics can fail.
>>>
>>> Not sure what you are doing exactly, but a mapreduce procedure
>>> generally only creates a single output. In that case it makes
>>> sense that only one ec set is used. If you want to use all ec
>>> sets for a single file, you should enable sharding (I haven't
>>> tested that) or split the result into multiple files.
>>>
>>> Xavi
>>>
>>>>
>>>> Thanks,
>>>> Serkan
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Serkan Çoban <cobanser...@gmail.com>
>>>> Date: Mon, Apr 18, 2016 at 2:39 PM
>>>> Subject: disperse volume file to subvolume mapping
>>>> To: Gluster Users <gluster-users@gluster.org>
>>>>
>>>> Hi, I have a problem where clients are using only 1/3 of the
>>>> nodes in a disperse volume for writing.
>>>> I am testing from 50 clients using 1 to 10 threads, with file
>>>> names like part-0-xxxx.
>>>> What I see is that clients only use 20 nodes for writing. How is
>>>> the file-name-to-subvolume hashing done? Is this related to the
>>>> file names being similar?
>>>>
>>>> My cluster is 3.7.10 with 60 nodes, each with 26 disks. The
>>>> disperse volume is 78 x (16+4). Only 26 out of 78 subvolumes are
>>>> used during writes.
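Coming back to my own question above about the name-to-subvolume hashing:
as far as I know, dht hashes the file name (by default with the
Davies-Meyer gf_dm_hashfn) and maps the result onto per-directory hash
ranges. As a rough sanity check that sequential names alone should not
cluster, here is a toy approximation using cksum (CRC32), which is NOT
gluster's real hash, just a stand-in:

  # toy check: map 500 sequential part-0-xxxx names to 78 buckets
  # with CRC32 (cksum); NOT gluster's actual dht hash
  for i in $(seq -f '%04g' 0 499); do
      h=$(printf 'part-0-%s' "$i" | cksum | cut -d' ' -f1)
      echo $((h % 78))
  done | sort -n | uniq -c

If the bucket counts come out roughly even, the similar names themselves
are probably not the cause. The real mapping also depends on the layout
ranges that dht stores on each brick directory, which can be inspected
with getfattr -n trusted.glusterfs.dht -e hex <brick>/<dir>, so that is
worth checking as well.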