[jira] [Commented] (HDFS-8129) Erasure Coding: Maintain consistent naming for Erasure Coding related classes - EC/ErasureCoding
[ https://issues.apache.org/jira/browse/HDFS-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524253#comment-14524253 ] Zhe Zhang commented on HDFS-8129: - +1 for EC. Filesystem and File System are both used, so FSXxx and FsXxx both look good. _Erasure_ and _Coding_ / _Code_ are always used as 2 words. Erasure Coding: Maintain consistent naming for Erasure Coding related classes - EC/ErasureCoding Key: HDFS-8129 URL: https://issues.apache.org/jira/browse/HDFS-8129 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Currently I see some classes named ErasureCode* and some named EC*. I feel we should maintain consistent naming across the project. This jira is to correct the places where we named things differently so the naming is uniform, and also to discuss which naming to follow from now on when we create new classes. ErasureCoding* should be fine IMO. Let's discuss what others feel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8311) DataStreamer.transfer() should time out the socket InputStream.
Esteban Gutierrez created HDFS-8311: --- Summary: DataStreamer.transfer() should time out the socket InputStream. Key: HDFS-8311 URL: https://issues.apache.org/jira/browse/HDFS-8311 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Esteban Gutierrez While validating some HA failure modes we found that HDFS clients can take a long time to recover, or sometimes don't recover at all, since we don't set up the socket timeout in the InputStream:
{code}
private void transfer() {
  ...
  OutputStream unbufOut = NetUtils.getOutputStream(sock, writeTimeout);
  InputStream unbufIn = NetUtils.getInputStream(sock);
  ...
}
{code}
The InputStream should have its own timeout in the same way as the OutputStream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
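A minimal sketch of the direction suggested above, assuming a hypothetical {{readTimeout}} alongside the existing {{writeTimeout}} ({{NetUtils}} offers a two-argument {{getInputStream}} overload that applies an explicit read timeout):
{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

import org.apache.hadoop.net.NetUtils;

class TransferTimeoutSketch {
  // Hypothetical helper mirroring the stream setup in DataStreamer.transfer():
  // both directions get an explicit timeout, so a stalled DataNode fails fast
  // instead of blocking the client indefinitely.
  static void openStreams(Socket sock, long writeTimeout, long readTimeout)
      throws IOException {
    OutputStream unbufOut = NetUtils.getOutputStream(sock, writeTimeout);
    // The one-argument form falls back to the socket's existing SO_TIMEOUT
    // (unset here); the two-argument overload enforces a read deadline.
    InputStream unbufIn = NetUtils.getInputStream(sock, readTimeout);
    // ... wrap and use the streams as transfer() does ...
  }
}
{code}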
[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524353#comment-14524353 ] Andrew Wang commented on HDFS-7678: --- Nits / comment requests:
* extra imports in DFSStripedInputStream
* I noticed these config keys aren't present in hdfs-default.xml, which is where we document the behavior of config keys. I bring it up because I was thinking about the semantic difference between an overall timeout vs. a per-op timeout. Since there are a couple of EC keys that need documentation, I'm okay punting this to a quick follow-on.
* These same keys could also use javadoc on the DFSClientConf getters so we don't need to hunt in hdfs-default.xml or look at how the code uses the configs.
* The importance of the largest read portion is not explained, nor is there any larger comment about the overall flow of the recovery logic. Some ascii art would help clarify all this; I drew something like this:
{noformat}
             +--+
 +--+        |  |
 |  |  +--+  |  |       +--+  +--+
 |  |  |  |  |  |       |  |  |  |
 +--+  +--+  +--+       +--+  +--+
  d1    d2    d3    |    p1    p2
{noformat}
This way it becomes obvious that if d1 fails, you need to lengthen the read to d3's span in addition to grabbing the parity. The TODO to optimize the read is also clarified; we always calculate the max span and always do max-span reads for recovery, even though we might only need the part of d3 that's unfetched. Relatedly, I could see printing the plan like this being nice for debugging, and a nice illustration for the test cases.
Code:
* AFAICT we don't respect deadNodes in scheduleOneStripedRead, or during the initial planning phase. I mentioned this a bit in person, but we also need to think about the dead-node marking policy for EC reads, since the timeout is aggressive.
* I don't understand the overall timeout handling, since if the overall timeout has elapsed, we still go through the switch and schedule recovery reads. Seems like something we should unit test.
* We also probably still want a per-operation timeout in addition to the overall timeout. If the per-op timeout hits, we keep trying to recover until the overall timeout runs out (or we run out of blocks).
* The InterruptedException catch should probably rethrow an InterruptedIOException or similar; the idea is to get us out of this function ASAP while cleaning up state.
Recommended follow-on work:
* fetchBlockByteRange is an outsize function and could use some splitting.
* There's no abstraction here for cells or stripes; we're mucking directly with internal blocks and byte ranges. IMO all the logic should happen on these higher-level abstractions, then we turn it into blocks and byte ranges after. Max-span, for instance, is a hack; we should be calculating recovery reads cell-by-cell based on what we already have, then combining the cells for an internal block together to form the actual read.
* One thought regarding the threadpool. What we really want is to limit the concurrency per-DN, i.e. a threadpool per DN. This works nicely with the read combining mentioned above, and does one better since it can coalesce recovery reads with already-queued work. This is like how disk IO queues work.
Erasure coding: DFSInputStream with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Li Bo Assignee: Zhe Zhang Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, HDFS-7678.000.patch, HDFS-7678.001.patch A block group reader will read data from a BlockGroup, whether it is in striping layout or contiguous layout. The corrupt blocks can be known before reading (told by the namenode), or be found during reading. The block group reader needs to do decoding work when some blocks are found to be corrupt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode
[ https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524649#comment-14524649 ] Hadoop QA commented on HDFS-7980: - (!) The patch artifact directory has been removed! This is a fatal error for test-patch.sh. Aborting. Jenkins (node H3) information at https://builds.apache.org/job/PreCommit-HDFS-Build/10516/ may provide some hints. Incremental BlockReport will dramatically slow down the startup of a namenode -- Key: HDFS-7980 URL: https://issues.apache.org/jira/browse/HDFS-7980 Project: Hadoop HDFS Issue Type: Bug Reporter: Hui Zheng Assignee: Walter Su Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, HDFS-7980.003.patch, HDFS-7980.004.patch In the current implementation the datanode calls the reportReceivedDeletedBlocks() method, which is an incremental block report, before calling the bpNamenode.blockReport() method. So in a large (several thousand datanodes) and busy cluster it will slow down the startup of the namenode by more than one hour.
{code}
List<DatanodeCommand> blockReport() throws IOException {
  // send block report if timer has expired.
  final long startTime = now();
  if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
    return null;
  }

  final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();

  // Flush any block information that precedes the block report. Otherwise
  // we have a chance that we will miss the delHint information
  // or we will report an RBW replica after the BlockReport already reports
  // a FINALIZED one.
  reportReceivedDeletedBlocks();
  lastDeletedReport = startTime;
  ...
  // Send the reports to the NN.
  int numReportsSent = 0;
  int numRPCs = 0;
  boolean success = false;
  long brSendStartTime = now();
  try {
    if (totalBlockCount <= dnConf.blockReportSplitThreshold) {
      // Below split threshold, send all reports in a single message.
      DatanodeCommand cmd = bpNamenode.blockReport(
          bpRegistration, bpos.getBlockPoolId(), reports);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524647#comment-14524647 ] Hadoop QA commented on HDFS-4937: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12595453/HDFS-4937.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10547/console | This message was automatically generated. ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom() -- Key: HDFS-4937 URL: https://issues.apache.org/jira/browse/HDFS-4937 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha, 0.23.8 Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-4937.patch When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if refreshing removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
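A simplified, hypothetical illustration of the failure mode described above (invented names, not the actual BlockPlacementPolicyDefault code): the loop bound is a snapshot of the cluster size, so once refreshNodes shrinks the topology the exit condition can never be met:
{code}
import java.util.HashSet;
import java.util.Set;

class ChooseRandomLoopSketch {
  interface Topology {              // hypothetical stand-in for NetworkTopology
    int getNumOfLeaves();
    String chooseRandom(String scope);
  }

  static String choose(Topology clusterMap, String scope) {
    int clusterSize = clusterMap.getNumOfLeaves();  // cached before the refresh
    Set<String> excluded = new HashSet<>();
    // If refreshNodes removes racks now, excluded.size() can stay below the
    // stale clusterSize forever, so this loop never terminates.
    while (excluded.size() < clusterSize) {
      String candidate = clusterMap.chooseRandom(scope);
      if (candidate == null) {
        return null;  // one possible fix: bail out when the topology is empty
      }
      if (!isGoodTarget(candidate)) {
        excluded.add(candidate);    // every node ends up excluded
        continue;
      }
      return candidate;
    }
    return null;
  }

  static boolean isGoodTarget(String node) { return false; } // placeholder
}
{code}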
[jira] [Commented] (HDFS-4977) Change Checkpoint Size of web ui of SecondaryNameNode
[ https://issues.apache.org/jira/browse/HDFS-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524639#comment-14524639 ] Hadoop QA commented on HDFS-4977: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12591878/HDFS-4977.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10544/console | This message was automatically generated. Change Checkpoint Size of web ui of SecondaryNameNode --- Key: HDFS-4977 URL: https://issues.apache.org/jira/browse/HDFS-4977 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Shinichi Yamashita Priority: Minor Labels: newbie Attachments: HDFS-4977-2.patch, HDFS-4977.patch, HDFS-4977.patch The checkpoint of the SecondaryNameNode after 2.0 is controlled by dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns. Because Checkpoint Size is still displayed in status.jsp of the SecondaryNameNode, it should be modified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4325) ClientProtocol.createSymlink parameter dirPerm invalid
[ https://issues.apache.org/jira/browse/HDFS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524642#comment-14524642 ] Hadoop QA commented on HDFS-4325: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12594139/HDFS-4325.v1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10545/console | This message was automatically generated. ClientProtocol.createSymlink parameter dirPerm invalid -- Key: HDFS-4325 URL: https://issues.apache.org/jira/browse/HDFS-4325 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client, namenode Affects Versions: 2.0.4-alpha Reporter: Binglin Chang Assignee: Binglin Chang Attachments: HDFS-4325.v1.patch
{code}
 * @param link The path of the link being created.
 * @param dirPerm permissions to use when creating parent directories
 * @param createParent - if true then missing parent dirs are created
 *                       if false then parent must exist
{code}
According to the javadoc, auto-created parent dirs' permissions will be dirPerm, but in fact directory permissions are always inherited from the parent directory, plus u+wx. IMHO, createSymlink behavior should be the same as create, which also inherits the parent dir permission, so the current behavior makes sense, but the related dirPerm parameter should be removed because it is invalid and confusing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4777) File creation with overwrite flag set to true results in logSync holding namesystem lock
[ https://issues.apache.org/jira/browse/HDFS-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524648#comment-14524648 ] Hadoop QA commented on HDFS-4777: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12581106/HDFS-4777.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10548/console | This message was automatically generated. File creation with overwrite flag set to true results in logSync holding namesystem lock Key: HDFS-4777 URL: https://issues.apache.org/jira/browse/HDFS-4777 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.23.0, 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-4777.patch FSNamesystem#startFileInternal calls delete. Delete method releases the write lock, making parts of startFileInternal code unintentionally executed without write lock being held. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3618) SSH fencing option may incorrectly succeed if nc (netcat) command not present
[ https://issues.apache.org/jira/browse/HDFS-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524650#comment-14524650 ] Hadoop QA commented on HDFS-3618: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12597900/HDFS-3618.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10549/console | This message was automatically generated. SSH fencing option may incorrectly succeed if nc (netcat) command not present - Key: HDFS-3618 URL: https://issues.apache.org/jira/browse/HDFS-3618 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover Affects Versions: 2.0.0-alpha Reporter: Brahma Reddy Battula Assignee: Vinayakumar B Attachments: HDFS-3618.patch, HDFS-3618.patch, HDFS-3618.patch, zkfc.txt, zkfc_threaddump.out Started NNs and zkfcs on SUSE 11. SUSE 11 ships netcat, so netcat -z works (but nc -z won't). While executing the following command we got "command not found", hence rc is non-zero and we assume the server is down. Here we end up concluding the service is down without actually checking it:
{code}
LOG.info("Indeterminate response from trying to kill service. " +
    "Verifying whether it is running using nc...");
rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
    " " + serviceAddr.getPort());
if (rc == 0) {
  // the service is still listening - we are unable to fence
  LOG.warn("Unable to fence - it is running but we cannot kill it");
  return false;
} else {
  LOG.info("Verified that the service is down.");
  return true;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
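A hedged sketch of one way to harden the check (hypothetical exec wrapper, not the committed patch): distinguish "nc reported the port closed" from "nc itself could not run", using the shell's 127 command-not-found exit code:
{code}
import java.io.IOException;

class NcFenceCheckSketch {
  interface Exec {                  // hypothetical stand-in for the JSch exec call
    int run(String cmd) throws IOException;
  }

  // Returns true only when nc actually ran and reported the port closed.
  static boolean verifiedDown(Exec ssh, String host, int port) throws IOException {
    int rc = ssh.run("nc -z " + host + " " + port);
    if (rc == 0) {
      return false;                 // port open: fencing failed
    }
    if (rc == 127) {                // shell: command not found
      throw new IOException("nc missing on target host; service state unknown");
    }
    return true;                    // nc ran and the port is closed
  }
}
{code}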
[jira] [Commented] (HDFS-4870) periodically re-resolve hostnames in included and excluded datanodes list
[ https://issues.apache.org/jira/browse/HDFS-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524646#comment-14524646 ] Hadoop QA commented on HDFS-4870: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12585903/HDFS-4870.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10546/console | This message was automatically generated. periodically re-resolve hostnames in included and excluded datanodes list - Key: HDFS-4870 URL: https://issues.apache.org/jira/browse/HDFS-4870 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4870.001.patch We currently only resolve the hostnames in the included and excluded datanodes lists once, when the list is read. The rationale for this is that in big clusters, DNS resolution for thousands of nodes can take a long time (when generating a datanode list in getDatanodeListForReport, for example). However, if the DNS information changes for one of these hosts, we should reflect that. A background thread could do these DNS resolutions every few minutes without blocking any foreground operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
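A hedged sketch of the background-resolution idea (hypothetical names; the actual patch may structure this differently): a single scheduled thread refreshes the cached resolutions so foreground operations never block on DNS:
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HostListResolverSketch {
  // reResolveHostLists would re-run DNS lookups for the include/exclude
  // lists and atomically swap in the fresh results.
  static ScheduledExecutorService start(Runnable reResolveHostLists) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleWithFixedDelay(reResolveHostLists, 5, 5, TimeUnit.MINUTES);
    return pool;
  }
}
{code}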
[jira] [Commented] (HDFS-6262) HDFS doesn't raise FileNotFoundException if the source of a rename() is missing
[ https://issues.apache.org/jira/browse/HDFS-6262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524857#comment-14524857 ] Hadoop QA commented on HDFS-6262: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12641192/HDFS-6262.2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10632/console | This message was automatically generated. HDFS doesn't raise FileNotFoundException if the source of a rename() is missing --- Key: HDFS-6262 URL: https://issues.apache.org/jira/browse/HDFS-6262 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Akira AJISAKA Attachments: HDFS-6262.2.patch, HDFS-6262.patch HDFS's {{rename(src, dest)}} returns false if src does not exist, while all the other filesystems raise {{FileNotFoundException}}. This behaviour is defined in {{FSDirectory.unprotectedRenameTo()}}: the attempt is logged, but the operation then just returns false. I propose changing the behaviour of {{DistributedFileSystem}} to be the same as that of the others, and of {{FileContext}}, which does reject renames with nonexistent sources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
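A hedged sketch of the proposed semantics (a hypothetical client-side wrapper for illustration; the real change would live in the rename path itself): a missing source surfaces as {{FileNotFoundException}} instead of a false return:
{code}
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class RenameSemanticsSketch {
  static void renameOrThrow(FileSystem fs, Path src, Path dst) throws IOException {
    if (!fs.exists(src)) {
      // matches FileContext and the other FileSystem implementations
      throw new FileNotFoundException("rename source " + src + " not found");
    }
    if (!fs.rename(src, dst)) {
      throw new IOException("rename " + src + " to " + dst + " failed");
    }
  }
}
{code}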
[jira] [Commented] (HDFS-4504) DFSOutputStream#close doesn't always release resources (such as leases)
[ https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524859#comment-14524859 ] Hadoop QA commented on HDFS-4504: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12599081/HDFS-4504.016.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10633/console | This message was automatically generated. DFSOutputStream#close doesn't always release resources (such as leases) --- Key: HDFS-4504 URL: https://issues.apache.org/jira/browse/HDFS-4504 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, HDFS-4504.010.patch, HDFS-4504.011.patch, HDFS-4504.014.patch, HDFS-4504.015.patch, HDFS-4504.016.patch {{DFSOutputStream#close}} can throw an {{IOException}} in some cases. One example is if there is a pipeline error and then pipeline recovery fails. Unfortunately, in this case, some of the resources used by the {{DFSOutputStream}} are leaked. One particularly important resource is file leases. So it's possible for a long-lived HDFS client, such as Flume, to write many blocks to a file, but then fail to close it. Unfortunately, the {{LeaseRenewerThread}} inside the client will continue to renew the lease for the undead file. Future attempts to close the file will just rethrow the previous exception, and no progress can be made by the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
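A hedged sketch of the cleanup pattern at issue (hypothetical shape and names, not the committed patch): close() should release client-side state even when flushing the final packets fails, so later close() calls don't rethrow forever while the LeaseRenewer keeps the undead file alive:
{code}
import java.io.IOException;

class CloseLeaseSketch {
  private boolean closed;

  public void close() throws IOException {
    if (closed) {
      return;
    }
    try {
      flushAndComplete();      // may throw after a failed pipeline recovery
    } finally {
      closed = true;
      endFileLease();          // always stop renewing the lease
    }
  }

  private void flushAndComplete() throws IOException { /* placeholder */ }
  private void endFileLease() { /* placeholder: lease released here */ }
}
{code}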
[jira] [Commented] (HDFS-4273) Fix some issues in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524845#comment-14524845 ] Hadoop QA commented on HDFS-4273: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12621932/HDFS-4273.v8.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10630/console | This message was automatically generated. Fix some issues in DFSInputStream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java The following issues in DFSInputStream are addressed in this jira:
1. read may not retry enough in some cases, causing early failure. Assume the following call logic:
{noformat}
readWithStrategy()
  -> blockSeekTo()
  -> readBuffer()
     -> reader.doRead()
     -> seekToNewSource()  // add currentNode to deadNodes, hoping to get a different datanode
        -> blockSeekTo()
           -> chooseDataNode()
              -> block missing, clear deadNodes and pick the currentNode again
        seekToNewSource() returns false
     readBuffer() re-throws the exception
  quit loop
readWithStrategy() got the exception, and may fail the read call before having tried MaxBlockAcquireFailures times.
{noformat}
2. In multi-threaded scenarios (like HBase), DFSInputStream.failures has a race condition: it is cleared to 0 while it is still used by another thread, so it is possible that some read thread may never quit. Changing failures to a local variable solves this issue.
3. If the local datanode is added to deadNodes, it will not be removed from deadNodes when the DN comes back alive. We need a way to remove the local datanode from deadNodes when it becomes live again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5745) Unnecessary disk check triggered when socket operation has a problem.
[ https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524839#comment-14524839 ] Hadoop QA commented on HDFS-5745: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12622369/HDFS-5745.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10628/console | This message was automatically generated. Unnecessary disk check triggered when socket operation has a problem. --- Key: HDFS-5745 URL: https://issues.apache.org/jira/browse/HDFS-5745 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 1.2.1 Reporter: MaoYuan Xian Attachments: HDFS-5745.patch When BlockReceiver's data transfer fails, SocketOutputStream translates the exception into an IOException with the message "The stream is closed":
{noformat}
2014-01-06 11:48:04,716 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver.run():
java.io.IOException: The stream is closed
    at org.apache.hadoop.net.SocketOutputStream.write
    at java.io.BufferedOutputStream.flushBuffer
    at java.io.BufferedOutputStream.flush
    at java.io.DataOutputStream.flush
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run
    at java.lang.Thread.run
{noformat}
This causes the checkDiskError method of DataNode to be called, which triggers a disk scan. Can we modify checkDiskError as below to avoid this unnecessary disk scan?
{code}
--- a/src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
+++ b/src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
@@ -938,7 +938,8 @@ public class DataNode extends Configured
         || e.getMessage().startsWith("An established connection was aborted")
         || e.getMessage().startsWith("Broken pipe")
         || e.getMessage().startsWith("Connection reset")
-        || e.getMessage().contains("java.nio.channels.SocketChannel")) {
+        || e.getMessage().contains("java.nio.channels.SocketChannel")
+        || e.getMessage().startsWith("The stream is closed")) {
       LOG.info("Not checking disk as checkDiskError was called on a network" +
           " related exception");
       return;
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4861) BlockPlacementPolicyDefault does not consider decommissioning racks
[ https://issues.apache.org/jira/browse/HDFS-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524841#comment-14524841 ] Hadoop QA commented on HDFS-4861: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12638791/HDFS-4861-v2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10629/console | This message was automatically generated. BlockPlacementPolicyDefault does not consider decommissioning racks --- Key: HDFS-4861 URL: https://issues.apache.org/jira/browse/HDFS-4861 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.23.7, 2.1.0-beta Reporter: Kihwal Lee Assignee: Rushabh S Shah Attachments: HDFS-4861-v2.patch, HDFS-4861.patch getMaxNodesPerRack() calculates the max replicas/rack like this: {code} int maxNodesPerRack = (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2; {code} Since this does not consider the racks that are being decommissioned and the decommissioning state is only checked later in isGoodTarget(), certain blocks are not replicated even when there are many racks and nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
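A worked illustration of the formula above with hypothetical numbers, showing how ignoring decommissioning racks under-budgets the per-rack cap:
{code}
class MaxNodesPerRackSketch {
  public static void main(String[] args) {
    int totalNumOfReplicas = 10;
    int numOfRacks = 5;                  // includes 2 decommissioning racks
    int maxNodesPerRack = (totalNumOfReplicas - 1) / numOfRacks + 2;  // == 3
    // Only 3 racks can actually accept replicas, so at most 3 racks * 3
    // nodes = 9 of the 10 replicas can be placed: under-replicated forever.
    System.out.println(maxNodesPerRack);
  }
}
{code}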
[jira] [Commented] (HDFS-3503) Move LengthInputStream and PositionTrackingInputStream to common
[ https://issues.apache.org/jira/browse/HDFS-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524837#comment-14524837 ] Hadoop QA commented on HDFS-3503: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12639073/h3503_20140407.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10626/console | This message was automatically generated. Move LengthInputStream and PositionTrackingInputStream to common Key: HDFS-3503 URL: https://issues.apache.org/jira/browse/HDFS-3503 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h3503_20140328.patch, h3503_20140407.patch We have LengthInputStream in org.apache.hadoop.hdfs.server.datanode.fsdataset and PositionTrackingInputStream in org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader. These two classes are generally useful. Let's move them to org.apache.hadoop.io. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6588) Investigating removing getTrueCause method in Server.java
[ https://issues.apache.org/jira/browse/HDFS-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524866#comment-14524866 ] Hadoop QA commented on HDFS-6588: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12657013/HDFS-6588.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10636/console | This message was automatically generated. Investigating removing getTrueCause method in Server.java - Key: HDFS-6588 URL: https://issues.apache.org/jira/browse/HDFS-6588 Project: Hadoop HDFS Issue Type: Bug Components: security, webhdfs Affects Versions: 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6588.001.patch, HDFS-6588.001.patch, HDFS-6588.001.patch, HDFS-6588.001.patch When addressing Daryn Sharp's comment for HDFS-6475, quoted below:
{quote}
What I'm saying is I think the patch adds too much unnecessary code. Filing an improvement to delete all but a few lines of the code changed in this patch seems a bit odd. I think you just need to:
- Delete getTrueCause entirely instead of moving it elsewhere
- In saslProcess, just throw the exception instead of running it through getTrueCause since it's not an InvalidToken wrapping another exception anymore.
- Keep your 3-line change to unwrap SecurityException in toResponse
{quote}
there are multiple test failures after making the suggested changes. Filing this jira to investigate removing the getTrueCause method. More detail will be put in the first comment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524868#comment-14524868 ] Hadoop QA commented on HDFS-6526: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12651960/HDFS-6526.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10637/console | This message was automatically generated. Implement HDFS TtlManager - Key: HDFS-6526 URL: https://issues.apache.org/jira/browse/HDFS-6526 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6526.1.patch This issue is used to track development of HDFS TtlManager, for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524849#comment-14524849 ] Hadoop QA commented on HDFS-6525: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12652123/HDFS-6525.2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10631/console | This message was automatically generated. FsShell supports HDFS TTL - Key: HDFS-6525 URL: https://issues.apache.org/jira/browse/HDFS-6525 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, tools Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6525.1.patch, HDFS-6525.2.patch This issue is used to track development of supporting HDFS TTL for FsShell, for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524860#comment-14524860 ] Hadoop QA commented on HDFS-5464: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12655886/h5464_20140715b.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10634/console | This message was automatically generated. Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch, h5464_20131105c.patch, h5464_20140715.patch, h5464_20140715b.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5730) Inconsistent Audit logging for HDFS APIs
[ https://issues.apache.org/jira/browse/HDFS-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524865#comment-14524865 ] Hadoop QA commented on HDFS-5730: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12623677/HDFS-5730.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10635/console | This message was automatically generated. Inconsistent Audit logging for HDFS APIs Key: HDFS-5730 URL: https://issues.apache.org/jira/browse/HDFS-5730 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HDFS-5730.patch, HDFS-5730.patch When looking at the audit logs in HDFS, I am seeing some inconsistencies between what was logged with audit historically and what was added recently. For more details please check the comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs
[ https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524099#comment-14524099 ] Lei (Eddy) Xu commented on HDFS-8306: - These test failures are not relevant; I've run them locally on my laptop. Also the findbugs warnings are about {{DataStreamer#LastException}}, which is not related to the changes included in this patch. Generate ACL and Xattr outputs in OIV XML outputs - Key: HDFS-8306 URL: https://issues.apache.org/jira/browse/HDFS-8306 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8306.000.patch Currently, in the {{hdfs oiv}} XML outputs, not all fields of the fsimage are output. This makes inspecting an {{fsimage}} through its XML output less practical, and it prevents recovering an fsimage from the XML file. This JIRA adds ACLs and XAttrs to the XML output as the first step toward the goal described in HDFS-8061. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs
[ https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524189#comment-14524189 ] Andrew Wang commented on HDFS-8306: --- Nice patch here Eddy. Overall it looks good, just two comments: * I think xattr keys can be binary too, so we would need to base64 them too. Check if this is true for me. We should unit test this case then too. * It's possible we add more data to the fsimage. In this case, I'd really like to see a compile-time error saying that support for this new data hasn't been added to OIV. This would also give me some confidence that our current OIV dump is complete. We can do this as a follow-on; it might require some refactoring. Generate ACL and Xattr outputs in OIV XML outputs - Key: HDFS-8306 URL: https://issues.apache.org/jira/browse/HDFS-8306 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8306.000.patch Currently, in the {{hdfs oiv}} XML outputs, not all fields of the fsimage are output. This makes inspecting an {{fsimage}} through its XML output less practical, and it prevents recovering an fsimage from the XML file. This JIRA adds ACLs and XAttrs to the XML output as the first step toward the goal described in HDFS-8061. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
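A hedged sketch of the encoding the review asks for (hypothetical helper names, not the patch's actual methods): base64 makes arbitrary xattr bytes safe to embed in, and recover from, the XML output:
{code}
import java.util.Base64;

class XAttrXmlSketch {
  static String toXmlSafe(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  static byte[] fromXmlSafe(String encoded) {
    return Base64.getDecoder().decode(encoded);
  }

  public static void main(String[] args) {
    byte[] binaryValue = {0x00, 0x1f, (byte) 0xff};  // not valid XML text
    String safe = toXmlSafe(binaryValue);
    System.out.println(safe + " -> " + fromXmlSafe(safe).length + " bytes");
  }
}
{code}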
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524360#comment-14524360 ] Hadoop QA commented on HDFS-7758: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 21 new or modified test files. | | {color:green}+1{color} | javac | 7m 30s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 44s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 19s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 6s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 13s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 164m 29s | Tests passed in hadoop-hdfs. | | | | 206m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729840/HDFS-7758.007.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3393461 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10509/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10509/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10509/console | This message was automatically generated. Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch, HDFS-7758.006.patch, HDFS-7758.007.patch HDFS-7496 introduced reference-counting of the volume instances being used, to prevent race conditions when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance without increasing its reference count. In this JIRA, we retire {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and related methods to access {{FsVolume}}, making sure that a consumer of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4812) add hdfsReadFully, hdfsWriteFully
[ https://issues.apache.org/jira/browse/HDFS-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524624#comment-14524624 ] Hadoop QA commented on HDFS-4812: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12582553/HDFS-4812.001.patch | | Optional Tests | javac unit | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10539/console | This message was automatically generated. add hdfsReadFully, hdfsWriteFully - Key: HDFS-4812 URL: https://issues.apache.org/jira/browse/HDFS-4812 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.1.0-beta Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4812.001.patch It would be nice to have {{hdfsReadFully}} and {{hdfsWriteFully}} in libhdfs. The current APIs don't guarantee that we read or write as much as we're told to do. We have readFully and writeFully in Java, but not in libhdfs at the moment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4730) KeyManagerFactory.getInstance supports SunX509 & ibmX509 in HsftpFileSystem.java
[ https://issues.apache.org/jira/browse/HDFS-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524620#comment-14524620 ] Hadoop QA commented on HDFS-4730: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12579988/HDFS-4730_trunk.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10536/console | This message was automatically generated. KeyManagerFactory.getInstance supports SunX509 & ibmX509 in HsftpFileSystem.java Key: HDFS-4730 URL: https://issues.apache.org/jira/browse/HDFS-4730 Project: Hadoop HDFS Issue Type: Bug Reporter: Tian Hong Wang Assignee: Tian Hong Wang Labels: patch Attachments: HDFS-4730-v1.patch, HDFS-4730_trunk.patch, HDFS-4730_trunk.patch In IBM Java, SunX509 should be ibmX509, so use SSLFactory.SSLCERTIFICATE to load it dynamically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly
[ https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524623#comment-14524623 ] Hadoop QA commented on HDFS-923: \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12440608/hdfs-923.patch | | Optional Tests | javac unit | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10538/console | This message was automatically generated. libhdfs hdfs_read example uses hdfsRead wrongly --- Key: HDFS-923 URL: https://issues.apache.org/jira/browse/HDFS-923 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Reporter: Ruyue Ma Assignee: Ruyue Ma Attachments: hdfs-923.patch In the examples of libhdfs, hdfs_read.c uses hdfsRead wrongly:
{noformat}
// read from the file
tSize curSize = bufferSize;
for (; curSize == bufferSize;) {
    // bug: a short read (0 < curSize < bufferSize) exits the loop even
    // though it does not indicate end-of-file
    curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
}
{noformat}
The condition curSize == bufferSize is problematic: hdfsRead may return fewer bytes than requested before reaching EOF, so the loop can stop early and silently drop the rest of the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4660) Duplicated checksum on DN in a recovered pipeline
[ https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524622#comment-14524622 ] Hadoop QA commented on HDFS-4660: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12576518/HDFS-4660.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10537/console | This message was automatically generated. Duplicated checksum on DN in a recovered pipeline - Key: HDFS-4660 URL: https://issues.apache.org/jira/browse/HDFS-4660 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Peng Zhang Priority: Critical Attachments: HDFS-4660.patch
pipeline: DN1 -> DN2 -> DN3
stop DN2; the pipeline adds node DN4 at the 2nd position: DN1 -> DN4 -> DN3; recover RBW
DN4 after recover rbw:
2013-04-01 21:02:31,570 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover RBW replica BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
2013-04-01 21:02:31,570 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW getNumBytes() = 134144 getBytesOnDisk() = 134144 getVisibleLength() = 134144
(ends at chunk 134144/512 = 262)
DN3 after recover rbw:
2013-04-01 21:02:31,575 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover RBW replica BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
2013-04-01 21:02:31,575 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW getNumBytes() = 134028 getBytesOnDisk() = 134028 getVisibleLength() = 134028
client sends a packet after pipeline recovery: offset=133632 len=1008
DN4 after flush:
2013-04-01 21:02:31,779 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file offset:134640; meta offset:1063
// the meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 1063.
DN3 after flush:
2013-04-01 21:02:31,782 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, lastPacketInBlock=false, offsetInBlock=134640, ackEnqueueNanoTime=8817026136871545)
2013-04-01 21:02:31,782 DEBUG org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing meta file offset of block BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 1055 to 1051
2013-04-01 21:02:31,782 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file offset:134640; meta offset:1059
After checking the meta file on DN4, I found the checksum of chunk 262 is duplicated, but the data is not. Later, after the block was finalized, DN4's scanner detected the bad block and reported it to the NN. The NN sent a command to delete this block and re-replicate it from another DN in the pipeline to satisfy the replication count. I think this is because BlockReceiver skips data bytes already written, but does not skip checksum bytes already written. And the function adjustCrcFilePosition is only used for the last non-completed chunk, not for this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
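A worked check of the meta-file arithmetic above (constants taken from the defaults in the logs: 512-byte chunks, 4-byte CRCs, 7-byte meta header):
{code}
public class MetaOffsetCheck {
  public static void main(String[] args) {
    long blockOffset = 134640;           // file offset from the DN log
    int bytesPerChecksum = 512, checksumSize = 4, headerSize = 7;
    // One CRC per chunk, counting the partial last chunk: ceil(134640/512) = 263.
    long chunks = (blockOffset + bytesPerChecksum - 1) / bytesPerChecksum;
    long expectedMetaOffset = chunks * checksumSize + headerSize;
    System.out.println(expectedMetaOffset);  // 1059; the observed 1063 is
                                             // exactly one extra 4-byte CRC
  }
}
{code}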
[jira] [Commented] (HDFS-4916) DataTransfer may mask the IOException during block transferring
[ https://issues.apache.org/jira/browse/HDFS-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524870#comment-14524870 ] Hadoop QA commented on HDFS-4916: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12588510/4916.v0.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10638/console | This message was automatically generated. DataTransfer may mask the IOException during block transferring -- Key: HDFS-4916 URL: https://issues.apache.org/jira/browse/HDFS-4916 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.0.4-alpha, 2.0.5-alpha Reporter: Zesheng Wu Priority: Critical Attachments: 4916.v0.patch When a new datanode is added to the pipeline, the client triggers the block transfer process. In the current implementation, the src datanode calls the run() method of DataTransfer to transfer the block. This method masks IOExceptions raised during the transfer, preventing the client from realizing the failure; as a result the client mistakes the failed transfer for a successful one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
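A hedged sketch of one way to stop masking the failure (hypothetical shape, not the actual patch): record the exception in the transfer thread so the caller can check it after join() instead of assuming success:
{code}
import java.io.IOException;

class DataTransferSketch implements Runnable {
  private volatile IOException error;

  @Override
  public void run() {
    try {
      transferBlock();                   // placeholder for the real transfer
    } catch (IOException e) {
      error = e;                         // surface the failure, don't swallow it
    }
  }

  IOException getError() { return error; }

  private void transferBlock() throws IOException { /* placeholder */ }
}
{code}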
[jira] [Commented] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524872#comment-14524872 ] Hadoop QA commented on HDFS-4754: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12596540/4754.v4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10639/console | This message was automatically generated. Add an API in the namenode to mark a datanode as stale -- Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch There is a detection of stale datanodes in HDFS since HDFS-3703, with a timeout defaulted to 30s. There are two reasons to add an API to mark a node as stale even if the timeout is not yet reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10 minutes limit. So I would propose to set a time bound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts? If there are no objections, and if nobody in the hdfs dev team has the time to spend on it, I will give it a try for branch 2 & 3. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
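A hypothetical caller's view of the proposed API (nothing here exists yet; names are taken from the proposal above):
{code}
class MarkStaleUsageSketch {
  interface NameNodeAdmin {              // proposed API surface
    void markStale(String ipAddress, int port, long durationInMs);
  }

  static void onZkSessionExpired(NameNodeAdmin namenode) {
    // ZooKeeper saw the RegionServer die, so mark its co-located DataNode
    // stale for 30s; afterwards the NN falls back to heartbeat detection.
    namenode.markStale("10.0.0.12", 50010, 30_000L);
  }
}
{code}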
[jira] [Commented] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe way.
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524873#comment-14524873 ] Hadoop QA commented on HDFS-6813: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12659533/HDFS-6813.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10640/console | This message was automatically generated. WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe way. --- Key: HDFS-6813 URL: https://issues.apache.org/jira/browse/HDFS-6813 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6813.001.patch The {{PositionedReadable}} definition requires that implementations of its interfaces be thread-safe. OffsetUrlInputStream (the WebHdfsFileSystem input stream) doesn't implement these interfaces in a thread-safe way; this JIRA is to fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
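A hedged sketch of one thread-safe shape for a positioned read over an HTTP-backed stream (hypothetical helper, not the committed patch): each pread uses its own ranged connection rather than seeking shared stream state:
{code}
import java.io.IOException;
import java.io.InputStream;

abstract class PositionedPreadSketch {
  abstract InputStream openRangedConnection(long position) throws IOException;

  // Safe to call concurrently: no shared position is mutated.
  public int read(long position, byte[] buffer, int offset, int length)
      throws IOException {
    try (InputStream in = openRangedConnection(position)) {
      return in.read(buffer, offset, length);
    }
  }
}
{code}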
[jira] [Updated] (HDFS-8249) Separate HdfsConstants into the client and the server side class
[ https://issues.apache.org/jira/browse/HDFS-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8249: -- Hadoop Flags: Reviewed +1 patch looks good. Separate HdfsConstants into the client and the server side class Key: HDFS-8249 URL: https://issues.apache.org/jira/browse/HDFS-8249 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8249.000.patch, HDFS-8249.001.patch, HDFS-8249.002.patch, HDFS-8249.003.patch, HDFS-8249.004.patch The constants in {{HdfsConstants}} are used by both the client side and the server side. There are three types of constants in the class: 1. Constants that are used internally by the servers or not part of the APIs. These constants are free to evolve without breaking compatibilities. For example, {{MAX_PATH_LENGTH}} is used by the NN to enforce that the length of the path does not grow too long. Developers are free to change the name of the constants and to move them around if necessary. 2. Constants that are used by the clients, but not parts of the APIs. For example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part of the wire protocol but the name is not. Developers are free to rename the constants but are not allowed to change the value of the constants. 3. Constants that are parts of the APIs. For example, {{SafeModeAction}} is used in {{DistributedFileSystem}}. Changing the name / value of the constant will break binary compatibility, but not source code compatibility. This jira proposes to separate the above three types of constants into different classes: * Creating a new class {{HdfsConstantsServer}} to hold the first type of constants. * Moving {{HdfsConstants}} into the {{hdfs-client}} package. The work of separating the second and the third types of constants will be postponed to a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8312) Trash does not descend into child directories to check for permissions
[ https://issues.apache.org/jira/browse/HDFS-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma reassigned HDFS-8312: -- Assignee: Takanobu Asanuma Trash does not descend into child directories to check for permissions -- Key: HDFS-8312 URL: https://issues.apache.org/jira/browse/HDFS-8312 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, security Affects Versions: 2.2.0, 2.6.0 Reporter: Eric Yang Assignee: Takanobu Asanuma HDFS trash does not descend into child directories to check if the user has permission to delete files. For example: Run the following commands to initialize the directory structure as the super user: {code} hadoop fs -mkdir /BSS/level1 hadoop fs -mkdir /BSS/level1/level2 hadoop fs -mkdir /BSS/level1/level2/level3 hadoop fs -put /tmp/appConfig.json /BSS/level1/level2/level3/testfile.txt hadoop fs -chown user1:users /BSS/level1/level2/level3/testfile.txt hadoop fs -chown -R user1:users /BSS/level1 hadoop fs -chmod -R 750 /BSS/level1 hadoop fs -chmod -R 640 /BSS/level1/level2/level3/testfile.txt hadoop fs -chmod 775 /BSS {code} Change to a normal user called user2. When trash is enabled: {code} sudo su user2 - hadoop fs -rm -r /BSS/level1 15/05/01 16:51:20 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 3600 minutes, Emptier interval = 0 minutes. Moved: 'hdfs://bdvs323.svl.ibm.com:9000/BSS/level1' to trash at: hdfs://bdvs323.svl.ibm.com:9000/user/user2/.Trash/Current {code} When trash is disabled: {code} /opt/ibm/biginsights/IHC/bin/hadoop fs -Dfs.trash.interval=0 -rm -r /BSS/level1 15/05/01 16:58:31 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes. rm: Permission denied: user=user2, access=ALL, inode=/BSS/level1:user1:users:drwxr-x--- {code} There is an inconsistency between trash behavior and delete behavior. When trash is enabled, files owned by user1 are deleted by user2. It looks like trash does not recursively validate whether the child directory files can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8312) Trash does not descend into child directories to check for permissions
[ https://issues.apache.org/jira/browse/HDFS-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-8312: --- Assignee: (was: Takanobu Asanuma) Trash does not descend into child directories to check for permissions -- Key: HDFS-8312 URL: https://issues.apache.org/jira/browse/HDFS-8312 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, security Affects Versions: 2.2.0, 2.6.0 Reporter: Eric Yang HDFS trash does not descend into child directories to check if the user has permission to delete files. For example: Run the following commands to initialize the directory structure as the super user: {code} hadoop fs -mkdir /BSS/level1 hadoop fs -mkdir /BSS/level1/level2 hadoop fs -mkdir /BSS/level1/level2/level3 hadoop fs -put /tmp/appConfig.json /BSS/level1/level2/level3/testfile.txt hadoop fs -chown user1:users /BSS/level1/level2/level3/testfile.txt hadoop fs -chown -R user1:users /BSS/level1 hadoop fs -chmod -R 750 /BSS/level1 hadoop fs -chmod -R 640 /BSS/level1/level2/level3/testfile.txt hadoop fs -chmod 775 /BSS {code} Change to a normal user called user2. When trash is enabled: {code} sudo su user2 - hadoop fs -rm -r /BSS/level1 15/05/01 16:51:20 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 3600 minutes, Emptier interval = 0 minutes. Moved: 'hdfs://bdvs323.svl.ibm.com:9000/BSS/level1' to trash at: hdfs://bdvs323.svl.ibm.com:9000/user/user2/.Trash/Current {code} When trash is disabled: {code} /opt/ibm/biginsights/IHC/bin/hadoop fs -Dfs.trash.interval=0 -rm -r /BSS/level1 15/05/01 16:58:31 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes. rm: Permission denied: user=user2, access=ALL, inode=/BSS/level1:user1:users:drwxr-x--- {code} There is an inconsistency between trash behavior and delete behavior. When trash is enabled, files owned by user1 are deleted by user2. It looks like trash does not recursively validate whether the child directory files can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3488) BlockPoolSliceScanner#getNewBlockScanTime does not handle numbers > 31 bits properly
[ https://issues.apache.org/jira/browse/HDFS-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524606#comment-14524606 ] Hadoop QA commented on HDFS-3488: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12574374/HDFS-3488.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10534/console | This message was automatically generated. BlockPoolSliceScanner#getNewBlockScanTime does not handle numbers > 31 bits properly Key: HDFS-3488 URL: https://issues.apache.org/jira/browse/HDFS-3488 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3488.001.patch This code does not handle the case where period > 2**31 properly: {code} long period = Math.min(scanPeriod, Math.max(blockMap.size(),1) * 600 * 1000L); int periodInt = Math.abs((int)period); return System.currentTimeMillis() - scanPeriod + DFSUtil.getRandom().nextInt(periodInt); {code} So, for example, if period = 0x1, we'll map that to 0, and so forth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
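A minimal sketch of the overflow and one way around it, under the assumption that the goal is a uniformly distributed offset in [0, period); this is not the actual patch:
{code}
import java.util.Random;

class ScanTimeSketch {
  // The bug: for period values needing more than 31 bits, (int) period wraps,
  // so Math.abs((int) period) yields a meaningless (possibly zero) bound.
  static long randomOffsetBroken(Random rand, long period) {
    int periodInt = Math.abs((int) period); // wraps for period >= 2**31
    return rand.nextInt(periodInt);         // wrong bound; throws if it wrapped to 0
  }

  // One fix: draw the offset in long arithmetic, with no narrowing cast.
  static long randomOffset(Random rand, long period) {
    if (period <= 0) {
      return 0;
    }
    return (long) (rand.nextDouble() * period); // uniform-ish in [0, period)
  }
}
{code}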
[jira] [Commented] (HDFS-3627) OfflineImageViewer oiv Indented processor prints out the Java class name in the DELEGATION_KEY field
[ https://issues.apache.org/jira/browse/HDFS-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524609#comment-14524609 ] Hadoop QA commented on HDFS-3627: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12574930/HDFS-3627.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10535/console | This message was automatically generated. OfflineImageViewer oiv Indented processor prints out the Java class name in the DELEGATION_KEY field Key: HDFS-3627 URL: https://issues.apache.org/jira/browse/HDFS-3627 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Ravi Prakash Priority: Minor Attachments: HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch Instead of the contents of the delegation key, the following is printed out: DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@1e2ca7 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@105bd58 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@1d1e730 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@1a116c9 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@df1832 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
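The {{Class@hexhash}} strings above are just default {{Object.toString()}} output, which is {{getClass().getName() + "@" + Integer.toHexString(hashCode())}}. A minimal sketch of the kind of fix, with illustrative field names rather than the actual DelegationKey API:
{code}
class KeySketch {
  final int keyId;
  final long expiryDate;

  KeySketch(int keyId, long expiryDate) {
    this.keyId = keyId;
    this.expiryDate = expiryDate;
  }

  // Without this override, printing the object yields e.g. KeySketch@1e2ca7;
  // overriding toString() lets the Indented processor show the actual fields.
  @Override
  public String toString() {
    return "keyId=" + keyId + ", expiryDate=" + expiryDate;
  }
}
{code}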
[jira] [Commented] (HDFS-5319) Links resolving either from active/standby should be same (example clicking on datanodes from Standby)
[ https://issues.apache.org/jira/browse/HDFS-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524809#comment-14524809 ] Hadoop QA commented on HDFS-5319: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12607249/HDFS-5319-v1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10614/console | This message was automatically generated. Links resolving either from active/standby should be same (example clicking on datanodes from Standby) -- Key: HDFS-5319 URL: https://issues.apache.org/jira/browse/HDFS-5319 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: HDFS-5319-v1.patch Clicking live nodes from the standby namenode will throw the exception: Operation category READ is not supported in state standby -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524812#comment-14524812 ] Hadoop QA commented on HDFS-6093: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12635714/hdfs-6093-4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10615/console | This message was automatically generated. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch, hdfs-6093-3.patch, hdfs-6093-4.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
[ https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524799#comment-14524799 ] Hadoop QA commented on HDFS-6092: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12634494/HDFS-6092-v4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10610/console | This message was automatically generated. DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port -- Key: HDFS-6092 URL: https://issues.apache.org/jira/browse/HDFS-6092 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Ted Yu Attachments: HDFS-6092-v4.patch, haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt I discovered this when working on HBASE-10717. Here is sample code to reproduce the problem: {code} Path desPath = new Path("hdfs://127.0.0.1/"); FileSystem desFs = desPath.getFileSystem(conf); String s = desFs.getCanonicalServiceName(); URI uri = desFs.getUri(); {code} The canonical name string contains the default port (8020), but the URI doesn't contain the port. This would result in the following exception: {code} testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils) Time elapsed: 0.001 sec ERROR! java.lang.IllegalArgumentException: port out of range:-1 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224) at org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88) {code} Thanks to Brandon Li who helped debug this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
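A minimal sketch of the caller-side normalization that avoids the port-out-of-range failure, assuming 8020 as the default NN RPC port (the helper name is illustrative):
{code}
import java.net.URI;

class PortSketch {
  private static final int DEFAULT_NN_PORT = 8020; // assumption: customary default

  // FileSystem#getUri() may omit the port (URI#getPort() returns -1) even
  // though getCanonicalServiceName() includes it; normalize before building
  // an InetSocketAddress.
  static int nnPort(URI uri) {
    int port = uri.getPort();
    return port == -1 ? DEFAULT_NN_PORT : port;
  }
}
{code}
For example, {{PortSketch.nnPort(URI.create("hdfs://127.0.0.1/"))}} returns 8020 instead of the -1 that triggered the exception above.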
[jira] [Commented] (HDFS-6165) hdfs dfs -rm -r and hdfs dfs -rmdir commands can't remove empty directory
[ https://issues.apache.org/jira/browse/HDFS-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524804#comment-14524804 ] Hadoop QA commented on HDFS-6165: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12642655/HDFS-6165.006.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10612/console | This message was automatically generated. hdfs dfs -rm -r and hdfs dfs -rmdir commands can't remove empty directory -- Key: HDFS-6165 URL: https://issues.apache.org/jira/browse/HDFS-6165 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6165.001.patch, HDFS-6165.002.patch, HDFS-6165.003.patch, HDFS-6165.004.patch, HDFS-6165.004.patch, HDFS-6165.005.patch, HDFS-6165.006.patch, HDFS-6165.006.patch Given a directory owned by user A with WRITE permission containing an empty directory owned by user B, it is not possible to delete user B's empty directory with either hdfs dfs -rm -r or hdfs dfs -rmdir, because the current implementation requires FULL permission on the empty directory and throws an exception. On the other hand, on Linux, the rm -r and rmdir commands can remove an empty directory as long as the parent directory has WRITE permission (and prefix components of the path have EXECUTE permission). Of the tested OSes, some prompt the user for confirmation, some don't. Here's a reproduction: {code} [root@vm01 ~]# hdfs dfs -ls /user/ Found 4 items drwxr-xr-x - userabc users 0 2013-05-03 01:55 /user/userabc drwxr-xr-x - hdfs supergroup 0 2013-05-03 00:28 /user/hdfs drwxrwxrwx - mapred hadoop 0 2013-05-03 00:13 /user/history drwxr-xr-x - hdfs supergroup 0 2013-04-14 16:46 /user/hive [root@vm01 ~]# hdfs dfs -ls /user/userabc Found 8 items drwx------ - userabc users 0 2013-05-02 17:00 /user/userabc/.Trash drwxr-xr-x - userabc users 0 2013-05-03 01:34 /user/userabc/.cm drwx------ - userabc users 0 2013-05-03 01:06 /user/userabc/.staging drwxr-xr-x - userabc users 0 2013-04-14 18:31 /user/userabc/apps drwxr-xr-x - userabc users 0 2013-04-30 18:05 /user/userabc/ds drwxr-xr-x - hdfs users 0 2013-05-03 01:54 /user/userabc/foo drwxr-xr-x - userabc users 0 2013-04-30 16:18 /user/userabc/maven_source drwxr-xr-x - hdfs users 0 2013-05-03 01:40 /user/userabc/test-restore [root@vm01 ~]# hdfs dfs -ls /user/userabc/foo/ [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -r -skipTrash /user/userabc/foo rm: Permission denied: user=userabc, access=ALL, inode=/user/userabc/foo:hdfs:users:drwxr-xr-x {code} The super user can delete the directory. {code} [root@vm01 ~]# sudo -u hdfs hdfs dfs -rm -r -skipTrash /user/userabc/foo Deleted /user/userabc/foo {code} The same is not true for files, however. They have the correct behavior. 
{code} [root@vm01 ~]# sudo -u hdfs hdfs dfs -touchz /user/userabc/foo-file [root@vm01 ~]# hdfs dfs -ls /user/userabc/ Found 8 items drwx------ - userabc users 0 2013-05-02 17:00 /user/userabc/.Trash drwxr-xr-x - userabc users 0 2013-05-03 01:34 /user/userabc/.cm drwx------ - userabc users 0 2013-05-03 01:06 /user/userabc/.staging drwxr-xr-x - userabc users 0 2013-04-14 18:31 /user/userabc/apps drwxr-xr-x - userabc users 0 2013-04-30 18:05 /user/userabc/ds -rw-r--r-- 1 hdfs users 0 2013-05-03 02:11 /user/userabc/foo-file drwxr-xr-x - userabc users 0 2013-04-30 16:18 /user/userabc/maven_source drwxr-xr-x - hdfs users 0 2013-05-03 01:40 /user/userabc/test-restore [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -skipTrash /user/userabc/foo-file Deleted /user/userabc/foo-file {code} Using hdfs dfs -rmdir command: {code} bash-4.1$ hadoop fs -lsr / lsr: DEPRECATED: Please use 'ls -R' instead. drwxr-xr-x - hdfs supergroup 0 2014-03-25 16:29 /user drwxr-xr-x - hdfs supergroup 0 2014-03-25 16:28 /user/hdfs drwxr-xr-x - usrabc users 0 2014-03-28 23:39 /user/usrabc drwxr-xr-x - abc abc 0 2014-03-28 23:39 /user/usrabc/foo-empty1 [root@vm01 usrabc]# su usrabc [usrabc@vm01 ~]$ hdfs dfs
[jira] [Commented] (HDFS-5951) Provide diagnosis information in the Web UI
[ https://issues.apache.org/jira/browse/HDFS-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524806#comment-14524806 ] Hadoop QA commented on HDFS-5951: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12628903/HDFS-5951.000.patch | | Optional Tests | | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10613/console | This message was automatically generated. Provide diagnosis information in the Web UI --- Key: HDFS-5951 URL: https://issues.apache.org/jira/browse/HDFS-5951 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5951.000.patch, diagnosis-failure.png, diagnosis-succeed.png HDFS should provide operation statistics in its UI. It can go one step further by leveraging the information to diagnose common problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5301) adding block pool % for each namespace on federated namenode webUI
[ https://issues.apache.org/jira/browse/HDFS-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524801#comment-14524801 ] Hadoop QA commented on HDFS-5301: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12606846/HDFS-5301-v1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10611/console | This message was automatically generated. adding block pool % for each namespace on federated namenode webUI -- Key: HDFS-5301 URL: https://issues.apache.org/jira/browse/HDFS-5301 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: HDFS-5301-v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524797#comment-14524797 ] Hadoop QA commented on HDFS-6193: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12643130/HDFS-6193-branch-2.4.v02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10609/console | This message was automatically generated. HftpFileSystem open should throw FileNotFoundException for non-existing paths - Key: HDFS-6193 URL: https://issues.apache.org/jira/browse/HDFS-6193 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: HDFS-6193-branch-2.4.0.v01.patch, HDFS-6193-branch-2.4.v02.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; it's deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of the code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.3.4#6332)
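A minimal sketch of the desired open() semantics (a sketch, not the actual patch; the server-side half of the fix would instead have FileDataServlet return SC_NOT_FOUND): probe the path eagerly so the caller gets FileNotFoundException at open time, like POSIX ENOENT.
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class EagerOpenSketch {
  // Still racy between the probe and the actual open, but it surfaces
  // missing paths at open() instead of deferring the failure to read().
  static FSDataInputStream open(FileSystem fs, Path path, int bufferSize)
      throws IOException {
    if (!fs.exists(path)) {
      throw new FileNotFoundException(path.toString());
    }
    return fs.open(path, bufferSize);
  }
}
{code}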
[jira] [Commented] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524795#comment-14524795 ] Hadoop QA commented on HDFS-6596: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12653150/HDFS-6596.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10608/console | This message was automatically generated. Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch, HDFS-6596.2.patch, HDFS-6596.2.patch, HDFS-6596.3.patch, HDFS-6596.3.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows: {code} int realLen = (int) Math.min(len, (blockEnd - pos + 1L)); if (locatedBlocks.isLastBlockComplete()) { realLen = (int) Math.min(realLen, locatedBlocks.getFileLength()); } int result = readBuffer(strategy, off, realLen, corruptedBlockMap); {code} From the above code, we can conclude that the read will return at most (blockEnd - pos + 1) bytes. As a result, when the read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this becomes a latency pain point under massive contention. So here we propose to loop internally in read() to do a best-effort read. In the current implementation of pread (read(position, buffer, offset, length)), it does loop internally to do a best-effort read. So we can refactor to support this on the normal read. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
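A minimal sketch of the proposed best-effort loop, written against a plain InputStream to show just the control flow (the real change would live inside DFSInputStream#read under its lock):
{code}
import java.io.IOException;
import java.io.InputStream;

class BestEffortReadSketch {
  // Keep reading until the request is satisfied or EOF, so one caller-visible
  // read() can span a block boundary instead of stopping at blockEnd and
  // forcing a second, separately-locked call.
  static int read(InputStream in, byte[] buf, int off, int len)
      throws IOException {
    int total = 0;
    while (total < len) {
      int n = in.read(buf, off + total, len - total); // may stop at a block edge
      if (n < 0) {
        break; // EOF: return whatever was read
      }
      total += n;
    }
    return (total == 0 && len > 0) ? -1 : total;
  }
}
{code}
This mirrors what pread already does internally, which is why the description suggests refactoring rather than new logic.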
[jira] [Commented] (HDFS-5177) blocksScheduled count should be decremented for abandoned blocks
[ https://issues.apache.org/jira/browse/HDFS-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524815#comment-14524815 ] Hadoop QA commented on HDFS-5177: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12616329/HDFS-5177.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10616/console | This message was automatically generated. blocksScheduled count should be decremented for abandoned blocks - Key: HDFS-5177 URL: https://issues.apache.org/jira/browse/HDFS-5177 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5177.patch, HDFS-5177.patch, HDFS-5177.patch DatanodeDescriptor#incBlocksScheduled() will be called for all datanodes of the block on each allocation. But the same should be decremented for abandoned blocks. When one of the datanodes is down and it is allocated for the block along with other live datanodes, the block will be abandoned, but the scheduled count will make the live datanodes look loaded even though in reality they may not be. Anyway, this scheduled count will be rolled every 20 minutes. The problem shows up when the rate of file creation is high. Due to the inflated scheduled count, the local datanode may be skipped for writes, and sometimes writes can fail outright in small clusters. So we need to decrement the unnecessary count on the abandonBlock call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
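A minimal sketch of the proposed accounting; {{incBlocksScheduled}} is named in the description, while the decrement hook and the surrounding types are illustrative:
{code}
class SchedulingSketch {
  static class DatanodeDescriptorSketch {
    private int blocksScheduled;

    void incBlocksScheduled() { blocksScheduled++; } // charged at allocation

    void decBlocksScheduled() {
      if (blocksScheduled > 0) {
        blocksScheduled--; // release the charge when the block is abandoned
      }
    }
  }

  // On abandonBlock, undo the scheduled-count charge on every target so the
  // remaining live datanodes are no longer counted as loaded.
  static void onAbandonBlock(DatanodeDescriptorSketch[] targets) {
    for (DatanodeDescriptorSketch dn : targets) {
      dn.decBlocksScheduled();
    }
  }
}
{code}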
[jira] [Commented] (HDFS-5699) NPE is thrown when DN is restarted during job execution
[ https://issues.apache.org/jira/browse/HDFS-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524831#comment-14524831 ] Hadoop QA commented on HDFS-5699: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12620331/HDFS-5699-0001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10624/console | This message was automatically generated. NPE is thrown when DN is restarted during job execution -- Key: HDFS-5699 URL: https://issues.apache.org/jira/browse/HDFS-5699 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: sathish Assignee: sathish Priority: Minor Attachments: HDFS-5699-0001.patch 1. Run jobs 2. Restart one DN 3. After the DN comes up, it should not throw an NPE in the DN logs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8248) Store INodeId instead of the INodeFile object in BlockInfoContiguous
[ https://issues.apache.org/jira/browse/HDFS-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524169#comment-14524169 ] Jing Zhao commented on HDFS-8248: - # Please add javadoc for {{BlockInfoContiguous#bcId}} # Currently we only record the replication factor in INodeFile. For a file in snapshots, its block replication factor is actually the largest replication factor in (current state + snapshots). The replication factor recorded in BlockInfo also needs to follow this logic. Also note that the BlockInfo's replication factor may need to be updated when a snapshot is deleted. # We can wrap the following two lines into a single function inside of BlockManager. {code} long bcId = getBlockCollectionId(block); bc = namesystem.getBlockCollection(bcId); {code} # Also need to fix the format in a couple of places in BlockManager. Store INodeId instead of the INodeFile object in BlockInfoContiguous Key: HDFS-8248 URL: https://issues.apache.org/jira/browse/HDFS-8248 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8248.000.patch, HDFS-8248.001.patch, HDFS-8248.002.patch, HDFS-8248.003.patch Currently the namespace and the block manager are tightly coupled together. There are two couplings in terms of implementation: 1. The {{BlockInfoContiguous}} stores a reference of the {{INodeFile}} that owns the block, so that the block manager can look up the corresponding file when replicating blocks, recovering from pipeline failures, etc. 1. The {{INodeFile}} stores {{BlockInfoContiguous}} objects that the file owns. Decoupling the namespace and the block manager allows the BM to be separated out from the Java heap or even as a standalone process. This jira proposes to remove the first coupling by storing the id of the inode instead of the object reference of {{INodeFile}} in the {{BlockInfoContiguous}} class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
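A minimal sketch of the helper suggested in the third review point, as a fragment inside BlockManager; the two quoted calls come from the comment above, while the method name is illustrative:
{code}
// Wraps the two-step lookup quoted above into one BlockManager method.
private BlockCollection getBlockCollection(BlockInfoContiguous block) {
  long bcId = getBlockCollectionId(block);
  return namesystem.getBlockCollection(bcId);
}
{code}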
[jira] [Commented] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524237#comment-14524237 ] Zhe Zhang commented on HDFS-7949: - Seems Jenkins is too busy. I read the patch again and it looks good. +1, and I just committed it. Thanks Rakesh for the contribution! I'll start our branch Jenkins just to make sure. WebImageViewer needs to support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7949: Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) WebImageViewer needs to support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6757) Simplify lease manager with INodeID
[ https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524512#comment-14524512 ] Hadoop QA commented on HDFS-6757: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 42s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 9 new or modified test files. | | {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 16s | The applied patch generated 22 new checkstyle issues (total was 904, now 907). | | {color:red}-1{color} | whitespace | 0m 19s | The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 5s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 165m 24s | Tests passed in hadoop-hdfs. | | | | 208m 47s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729861/HDFS-6757.013.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6f541ed | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10513/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/10513/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10513/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10513/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10513/console | This message was automatically generated. Simplify lease manager with INodeID --- Key: HDFS-6757 URL: https://issues.apache.org/jira/browse/HDFS-6757 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch, HDFS-6757.005.patch, HDFS-6757.006.patch, HDFS-6757.007.patch, HDFS-6757.008.patch, HDFS-6757.009.patch, HDFS-6757.010.patch, HDFS-6757.011.patch, HDFS-6757.012.patch, HDFS-6757.013.patch Currently the lease manager records leases based on path instead of inode ids. Therefore, the lease manager needs to carefully keep track of the path of active leases during renames and deletes. This can be a non-trivial task. This jira proposes to simplify the logic by tracking leases using inodeids instead of paths. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
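A minimal sketch of the proposed simplification (the data structures are illustrative, not the patch): keying leases by inode id means renames and deletes need no lease bookkeeping, because the id stays stable while the path can change.
{code}
import java.util.HashMap;
import java.util.Map;

class LeaseManagerSketch {
  static final class Lease {
    final String holder;
    Lease(String holder) { this.holder = holder; }
  }

  // Path-keyed maps must be rewritten on every rename/delete under a lease;
  // an inode-id key is immutable for the life of the file.
  private final Map<Long, Lease> leasesByInodeId = new HashMap<>();

  void addLease(long inodeId, String holder) {
    leasesByInodeId.put(inodeId, new Lease(holder));
  }

  Lease getLease(long inodeId) {
    return leasesByInodeId.get(inodeId);
  }

  void removeLease(long inodeId) {
    leasesByInodeId.remove(inodeId);
  }
}
{code}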
[jira] [Commented] (HDFS-8249) Separate HdfsConstants into the client and the server side class
[ https://issues.apache.org/jira/browse/HDFS-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524528#comment-14524528 ] Hadoop QA commented on HDFS-8249: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 59s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 37 new or modified test files. | | {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 18s | The applied patch generated 6 new checkstyle issues (total was 156, now 160). | | {color:red}-1{color} | whitespace | 3m 11s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 5m 3s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 19s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 164m 30s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 15s | Tests passed in hadoop-hdfs-client. | | {color:green}+1{color} | hdfs tests | 1m 43s | Tests passed in hadoop-hdfs-nfs. | | {color:green}+1{color} | hdfs tests | 3m 54s | Tests passed in bkjournal. | | | | 219m 51s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729877/HDFS-8249.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6f541ed | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | hadoop-hdfs-nfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/artifact/patchprocess/testrun_hadoop-hdfs-nfs.txt | | bkjournal test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/artifact/patchprocess/testrun_bkjournal.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10514/console | This message was automatically generated. 
Separate HdfsConstants into the client and the server side class Key: HDFS-8249 URL: https://issues.apache.org/jira/browse/HDFS-8249 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8249.000.patch, HDFS-8249.001.patch, HDFS-8249.002.patch, HDFS-8249.003.patch, HDFS-8249.004.patch The constants in {{HdfsConstants}} are used by both the client side and the server side. There are three types of constants in the class: 1. Constants that are used internally by the servers or not part of the APIs. These constants are free to evolve without breaking compatibilities. For example, {{MAX_PATH_LENGTH}} is used by the NN to enforce that the length of the path does not grow too long. Developers are free to change the name of the constants and to move them around if necessary. 2. Constants that are used by the clients, but not parts of the APIs. For example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part of the wire protocol but the name is not. Developers are free to rename the constants but are not allowed to change the value of the constants. 3. Constants that are parts of the APIs. For example, {{SafeModeAction}} is used in {{DistributedFileSystem}}. Changing the name / value of the constant will break binary compatibility, but not source code compatibility. This jira proposes to separate the above three types of constants into different classes: * Creating a new class {{HdfsConstantsServer}} to hold the first type of constants. * Moving {{HdfsConstants}} into the {{hdfs-client}} package. The work of separating the second and the third types of constants will be postponed to a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3384) DataStreamer thread should be closed immediately when failed to setup a PipelineForAppendOrRecovery
[ https://issues.apache.org/jira/browse/HDFS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524605#comment-14524605 ] Hadoop QA commented on HDFS-3384: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12530164/HDFS-3384_2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10533/console | This message was automatically generated. DataStreamer thread should be closed immediately when failed to setup a PipelineForAppendOrRecovery -- Key: HDFS-3384 URL: https://issues.apache.org/jira/browse/HDFS-3384 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.0.0-alpha Reporter: Brahma Reddy Battula Assignee: amith Attachments: HDFS-3384.patch, HDFS-3384_2.patch, HDFS-3384_2.patch, HDFS-3384_2.patch Scenario: = write a file, corrupt a block manually, call append. {noformat} 2012-04-19 09:33:10,776 INFO hdfs.DFSClient (DFSOutputStream.java:createBlockOutputStream(1059)) - Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1039) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:939) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461) 2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:run(549)) - DataStreamer Exception java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:510) 2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:hflush(1511)) - Error while syncing java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting... at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461) java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting... at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4406) read file failure when the file is not closed in secure mode
[ https://issues.apache.org/jira/browse/HDFS-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524600#comment-14524600 ] Hadoop QA commented on HDFS-4406: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12565451/HDFS-4404.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10532/console | This message was automatically generated. read file failure when the file is not closed in secure mode --- Key: HDFS-4406 URL: https://issues.apache.org/jira/browse/HDFS-4406 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.0.2-alpha Reporter: liaowenrui Priority: Critical Attachments: BlockTokenSecretManager.patch, HDFS-4404.patch 2013-01-14 18:27:06,216 WARN SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 160.172.0.11:45176:null 2013-01-14 18:27:06,217 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: readAndProcess threw exception javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't re-compute password for block_token_identifier (expiryDate=1358195226206, keyId=1639335405, userId=hbase, blockPoolId=BP-myhacluster-25656, blockId=-6489888518203477527, access modes=[READ]), since the required block key (keyID=1639335405) doesn't exist.] from client 160.172.0.11. Count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't re-compute password for block_token_identifier (expiryDate=1358195226206, keyId=1639335405, userId=hbase, blockPoolId=BP-myhacluster-25656, blockId=-6489888518203477527, access modes=[READ]), since the required block key (keyID=1639335405) doesn't exist.] at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:577) at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:226) at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1199) at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1393) at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:710) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:509) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:484) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3272) Make it possible to state MIME type for a webhdfs OPEN operation's result
[ https://issues.apache.org/jira/browse/HDFS-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524599#comment-14524599 ] Hadoop QA commented on HDFS-3272: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12563833/HDFS-3272.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10531/console | This message was automatically generated. Make it possible to state MIME type for a webhdfs OPEN operation's result - Key: HDFS-3272 URL: https://issues.apache.org/jira/browse/HDFS-3272 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 1.0.1, 2.0.2-alpha Reporter: Steve Loughran Priority: Minor Attachments: HDFS-3272.patch when you do a GET from the browser with webhdfs, you get the file, but it comes over as a binary as the browser doesn't know what type it is. Having a mime mapping table and such like would be one solution, but another is simply to add a {{mime}} query parameter that would provide a string to be reflected back to the caller as the Content-Type header in the HTTP response. e.g. {code} http://ranier:50070/webhdfs/v1/results/Debounce/part-r-0.csv?op=open&mime=text/csv {code} would generate a 307 redirect to the datanode, with the {code} http://dn1:50075/webhdfs/v1/results/Debounce/part-r-0.csv?op=open&mime=text/csv {code} which would then generate the result {code} 200 OK Content-Type:text/csv GATE4,eb8bd736445f415e18886ba037f84829,55000,2007-01-14,14:01:54, GATE4,ec58edcce1049fa665446dc1fa690638,8030803000,2007-01-14,13:52:31, ... {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3671) ByteRangeInputStream shouldn't require the content length header be present
[ https://issues.apache.org/jira/browse/HDFS-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524596#comment-14524596 ] Hadoop QA commented on HDFS-3671: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12540609/h3671_20120812.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10530/console | This message was automatically generated. ByteRangeInputStream shouldn't require the content length header be present --- Key: HDFS-3671 URL: https://issues.apache.org/jira/browse/HDFS-3671 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Priority: Critical Attachments: h3671_20120717.patch, h3671_20120719.patch, h3671_20120812.patch Per HDFS-3318 the content length header check breaks distcp compatibility with previous releases (0.20.2 and earlier, and 0.21). Like branch-1 this check should be lenient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524727#comment-14524727 ] Hadoop QA commented on HDFS-4311: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12611109/HDFS-4311--N5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10577/console | This message was automatically generated. repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos --- Key: HDFS-4311 URL: https://issues.apache.org/jira/browse/HDFS-4311 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Attachments: HDFS-4311--N1.patch, HDFS-4311--N2.patch, HDFS-4311--N3.patch, HDFS-4311--N4.patch, HDFS-4311--N5.patch, HDFS-4311.patch Some of the test cases in this test class are failing because they are affected by static state changed by the previous test cases. Namely, this is the static field org.apache.hadoop.security.UserGroupInformation.loginUser. The suggested patch solves this problem. Besides, the following improvements are made: 1) parametrized the user principal and keytab values via system properties; 2) added shutdown of the Jetty server and the minicluster between the test cases to make the test methods independent of each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3215) Block size is logged as zero even when blockReceived command is received by DN
[ https://issues.apache.org/jira/browse/HDFS-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524734#comment-14524734 ] Hadoop QA commented on HDFS-3215: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12614783/HDFS-3215.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10580/console | This message was automatically generated. Block size is logged as zero even when blockReceived command is received by DN Key: HDFS-3215 URL: https://issues.apache.org/jira/browse/HDFS-3215 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Brahma Reddy Battula Assignee: Shinichi Yamashita Priority: Minor Attachments: HDFS-3215.patch, HDFS-3215.patch Scenario 1 == Start NN and DN. Write a file. Block size is logged as zero even when the blockReceived command is received by the DN. *NN log* 2012-03-14 20:23:40,541 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hadoop-create-user.sh._COPYING_. BP-1166515020-10.18.40.24-1331736264353 blk_1264419582929433995_1002{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[XXX:50010|RBW]]} 2012-03-14 20:24:26,357 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* addStoredBlock: blockMap updated: XXX:50010 is added to blk_1264419582929433995_1002{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[XXX:50010|RBW]]} size 0 *DN log* 2012-03-14 20:24:17,519 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block BP-1166515020-XXX-1331736264353:blk_1264419582929433995_1002 src: /XXX:53141 dest: /XXX:50010 2012-03-14 20:24:26,517 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /XXX:53141, dest: /XXX:50010, bytes: 512, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1612873957_1, offset: 0, srvID: DS-1639667928-XXX-50010-1331736284942, blockid: BP-1166515020-XXX-1331736264353:blk_1264419582929433995_1002, duration: 1286482503 2012-03-14 20:24:26,517 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1166515020-XXX-1331736264353:blk_1264419582929433995_1002, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2012-03-14 20:24:31,533 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1166515020-XXX-1331736264353:blk_1264419582929433995_1002 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5654) Add lock context support to FSNamesystemLock
[ https://issues.apache.org/jira/browse/HDFS-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524740#comment-14524740 ] Hadoop QA commented on HDFS-5654: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12618264/HDFS-5654.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10584/console | This message was automatically generated. Add lock context support to FSNamesystemLock Key: HDFS-5654 URL: https://issues.apache.org/jira/browse/HDFS-5654 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5654.patch Supporting new methods of locking the namesystem, i.e. coarse or fine-grained, needs an API to manage the locks (or any object conforming to the Lock interface) held during access to the namespace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
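A minimal sketch of what such an API could look like, assuming an AutoCloseable context over java.util.concurrent Lock objects (all names are illustrative, not the patch):
{code}
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Callers acquire whatever Lock the context hands out - a single coarse
// namesystem lock today, possibly finer-grained locks later - without
// hard-coding which lock protects the namespace.
class FSLockContextSketch implements AutoCloseable {
  private final Lock lock;

  private FSLockContextSketch(Lock lock) {
    this.lock = lock;
    lock.lock();
  }

  static FSLockContextSketch readLock(ReentrantReadWriteLock rw) {
    return new FSLockContextSketch(rw.readLock());
  }

  static FSLockContextSketch writeLock(ReentrantReadWriteLock rw) {
    return new FSLockContextSketch(rw.writeLock());
  }

  @Override
  public void close() {
    lock.unlock();
  }
}
{code}
Usage would be try-with-resources, e.g. {{try (FSLockContextSketch c = FSLockContextSketch.readLock(fsLock)) { ... }}}, so the unlock cannot be skipped on an exception path.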
[jira] [Commented] (HDFS-5263) Delegation token is not created in generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp
[ https://issues.apache.org/jira/browse/HDFS-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524716#comment-14524716 ] Hadoop QA commented on HDFS-5263: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 1s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12606158/HDFS-5263-rev1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10570/console | This message was automatically generated. Delegation token is not created in generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp -- Key: HDFS-5263 URL: https://issues.apache.org/jira/browse/HDFS-5263 Project: Hadoop HDFS Issue Type: Bug Components: namenode, webhdfs Reporter: Vasu Mariyala Attachments: HDFS-5263-rev1.patch, HDFS-5263.patch When Kerberos authentication is enabled, we are unable to browse to the data nodes using (Name node web page -- Live Nodes -- select any of the data nodes). The reason is that the delegation token is not provided as part of the URL in the generateNodeDataHeader method of NodeListJsp: {code} String url = HttpConfig.getSchemePrefix() + d.getHostName() + ":" + d.getInfoPort() + "/browseDirectory.jsp?namenodeInfoPort=" + nnHttpPort + "&dir=" + URLEncoder.encode("/", "UTF-8") + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, nnaddr); {code} But browsing the file system using the name node web page (Browse the file system -- any directory) works fine, because the redirectToRandomDataNode method of NamenodeJspHelper creates the delegation token: {code} redirectLocation = HttpConfig.getSchemePrefix() + fqdn + ":" + redirectPort + "/browseDirectory.jsp?namenodeInfoPort=" + nn.getHttpAddress().getPort() + "&dir=/" + (tokenString == null ? "" : JspHelper.getDelegationTokenUrlParam(tokenString)) + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, addr); {code} I will work on providing a patch for this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
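The fix presumably amounts to appending the delegation token parameter in NodeListJsp the same way redirectToRandomDataNode already does. A sketch (not the attached patch), assuming a tokenString obtained as in that method:
{code}
String url = HttpConfig.getSchemePrefix() + d.getHostName() + ":"
    + d.getInfoPort() + "/browseDirectory.jsp?namenodeInfoPort="
    + nnHttpPort + "&dir=" + URLEncoder.encode("/", "UTF-8")
    // sketch: add the token parameter, mirroring redirectToRandomDataNode
    + (tokenString == null ? ""
        : JspHelper.getDelegationTokenUrlParam(tokenString))
    + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, nnaddr);
{code}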
[jira] [Commented] (HDFS-5384) Add a new TransitionState to indicate NN is in transition from standby state to active state
[ https://issues.apache.org/jira/browse/HDFS-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524726#comment-14524726 ] Hadoop QA commented on HDFS-5384: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12609503/HDFS-5384.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10576/console | This message was automatically generated. Add a new TransitionState to indicate NN is in transition from standby state to active state Key: HDFS-5384 URL: https://issues.apache.org/jira/browse/HDFS-5384 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5384.000.patch, HDFS-5384.001.patch Currently in an HA setup, when a NameNode is transitioning from standby to active, the current code first sets the state of the NN to Active, then starts the active services, during which the NN still needs to tail the remaining editlog and may not be able to serve certain requests as expected (such as HDFS-5322). So it may be necessary to define a transition state to indicate that the NN has left the previous state and is transitioning to the next state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
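As a sketch of the idea only: the proposal boils down to an explicit intermediate value in the HA state enum. The constant name below is hypothetical, not what the patch uses:
{code}
/** Sketch only; mirrors the shape of HAServiceProtocol.HAServiceState. */
enum HAServiceState {
  INITIALIZING,
  ACTIVE,
  STANDBY,
  // hypothetical addition: set while the NN is still tailing the
  // remaining editlog, before it can serve all active-state requests
  TRANSITIONING_TO_ACTIVE
}
{code}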
[jira] [Commented] (HDFS-5180) Output the processing time of slow RPC request to node's log
[ https://issues.apache.org/jira/browse/HDFS-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524729#comment-14524729 ] Hadoop QA commented on HDFS-5180: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12609752/HDFS-5180.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10578/console | This message was automatically generated. Output the processing time of slow RPC request to node's log Key: HDFS-5180 URL: https://issues.apache.org/jira/browse/HDFS-5180 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 3.0.0 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: HDFS-5180.patch, HDFS-5180.patch In current trunk, the processing time of every RPC request is logged at DEBUG level. When troubleshooting a large-scale cluster, this is hard to work with. Therefore we should set a threshold and output only slow RPCs to the node's log, so that abnormal behavior can be spotted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
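A minimal sketch of the thresholding being proposed; the config key name and log levels here are assumptions for illustration, not the attached patch:
{code}
import org.apache.commons.logging.Log;

class SlowRpcLogger {
  // assumed key name, for illustration only
  static final String SLOW_RPC_THRESHOLD_MS_KEY =
      "ipc.server.slow.rpc.threshold.ms";

  static void logProcessingTime(Log log, String method,
      long elapsedMs, long thresholdMs) {
    if (elapsedMs > thresholdMs) {
      // only slow calls reach the node's log at a visible level
      log.warn("Slow RPC: " + method + " took " + elapsedMs + " ms");
    } else if (log.isDebugEnabled()) {
      log.debug("RPC: " + method + " took " + elapsedMs + " ms");
    }
  }
}
{code}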
[jira] [Commented] (HDFS-5357) TestFileSystemAccessService failures in JDK7
[ https://issues.apache.org/jira/browse/HDFS-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524722#comment-14524722 ] Hadoop QA commented on HDFS-5357: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12608332/HDFS-5357v1.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10572/console | This message was automatically generated. TestFileSystemAccessService failures in JDK7 Key: HDFS-5357 URL: https://issues.apache.org/jira/browse/HDFS-5357 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.9 Reporter: Robert Parker Assignee: Robert Parker Attachments: HDFS-5357v1.patch junit.framework.AssertionFailedError: Expected Exception: ServiceException got: ExceptionInInitializerError at junit.framework.Assert.fail(Assert.java:47) at org.apache.hadoop.test.TestExceptionHelper$1.evaluate(TestExceptionHelper.java:56) at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184) at org.junit.runners.ParentRunner.run(ParentRunner.java:236) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:208) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:159) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:87) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:95) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5362) Add SnapshotException to terse exception group
[ https://issues.apache.org/jira/browse/HDFS-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524725#comment-14524725 ] Hadoop QA commented on HDFS-5362: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12609093/HDFS-5362.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10575/console | This message was automatically generated. Add SnapshotException to terse exception group -- Key: HDFS-5362 URL: https://issues.apache.org/jira/browse/HDFS-5362 Project: Hadoop HDFS Issue Type: Improvement Components: snapshots Affects Versions: 3.0.0 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Priority: Minor Attachments: HDFS-5362.patch In trunk, the full stack trace of SnapshotException is output to the NameNode's log via the ipc.Server class. The message alone is enough to understand a SnapshotException, so it should be added to the terse exception group of NameNodeRpcServer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
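For context, Hadoop's ipc.Server keeps a terse-exception list whose entries are logged by message only. The change would presumably be a one-line registration in NameNodeRpcServer along these lines (a sketch, not the attached patch):
{code}
// register SnapshotException so ipc.Server logs only its message,
// alongside the exceptions NameNodeRpcServer already registers
clientRpcServer.addTerseExceptions(
    org.apache.hadoop.hdfs.protocol.SnapshotException.class);
{code}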
[jira] [Commented] (HDFS-5549) Support for implementing custom FsDatasetSpi from outside the project
[ https://issues.apache.org/jira/browse/HDFS-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524735#comment-14524735 ] Hadoop QA commented on HDFS-5549: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12615703/HDFS-5549.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10581/console | This message was automatically generated. Support for implementing custom FsDatasetSpi from outside the project - Key: HDFS-5549 URL: https://issues.apache.org/jira/browse/HDFS-5549 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0 Reporter: Ignacio Corderi Attachments: HDFS-5549.patch Visibility for multiple methods and a few classes was changed to public to allow FsDatasetSpi<T> and all the related classes that need subtyping to be fully implemented from outside the HDFS project. Block transfers were abstracted into a factory, given that the behavior will be changed for DataNodes using Kinetic drives. The existing DataNode-to-DataNode block transfer functionality was moved to LegacyBlockTransferer; no new configuration is needed to use this class and keep the behavior that is currently present. DataNodes have an additional configuration key DFS_DATANODE_BLOCKTRANSFERER_FACTORY_KEY to override the default block transfer behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5639) rpc scheduler abstraction
[ https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524737#comment-14524737 ] Hadoop QA commented on HDFS-5639: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12617519/HDFS-5639-2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10582/console | This message was automatically generated. rpc scheduler abstraction - Key: HDFS-5639 URL: https://issues.apache.org/jira/browse/HDFS-5639 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Attachments: HDFS-5639-2.patch, HDFS-5639.patch We have run into various issues in namenode and hbase w.r.t. rpc handling in multi-tenant clusters. Examples are https://issues.apache.org/jira/i#browse/HADOOP-9640 https://issues.apache.org/jira/i#browse/HBASE-8836 There are different ideas on how to prioritize rpc requests. It could be based on user id, or on whether it is a read request or a write request, or it could use a specific rule, such as a datanode's RPC being more important than a client RPC. We want to enable people to implement and experiment with different rpc schedulers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
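For illustration, the abstraction being asked for might look like this hypothetical interface (names are assumptions, not the attached patch):
{code}
/**
 * Hypothetical sketch: implementations decide how an incoming call is
 * prioritized, whether by user id, read vs. write, or caller type
 * (datanode vs. client).
 */
public interface RpcScheduler {
  /** Queue index (priority level) for an incoming call. */
  int getPriorityLevel(String userName, String methodName);

  /** Number of priority levels this scheduler uses. */
  int getNumLevels();
}
{code}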
[jira] [Commented] (HDFS-5517) Lower the default maximum number of blocks per file
[ https://issues.apache.org/jira/browse/HDFS-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524732#comment-14524732 ] Hadoop QA commented on HDFS-5517: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12614116/HDFS-5517.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10579/console | This message was automatically generated. Lower the default maximum number of blocks per file --- Key: HDFS-5517 URL: https://issues.apache.org/jira/browse/HDFS-5517 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-5517.patch We introduced the maximum number of blocks per file in HDFS-4305, but we set the default to 1MM. In practice this limit is so high as to never be hit, whereas we know that an individual file with 10s of thousands of blocks can cause problems. We should lower the default value, in my opinion to 10k. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
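Since the limit is an ordinary NameNode configuration key (introduced by HDFS-4305), a deployment that depends on today's effectively-unlimited behavior could pin it explicitly when the default drops; a minimal sketch:
{code}
import org.apache.hadoop.conf.Configuration;

class PinBlockLimit {
  static Configuration pin(Configuration conf) {
    // keep the pre-change default of 1MM even after the default is lowered
    conf.setInt("dfs.namenode.fs-limits.max-blocks-per-file", 1000000);
    return conf;
  }
}
{code}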
[jira] [Commented] (HDFS-5361) Change the unit of StartupProgress 'PercentComplete' to percentage
[ https://issues.apache.org/jira/browse/HDFS-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524724#comment-14524724 ] Hadoop QA commented on HDFS-5361: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12608564/HDFS-5361.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10574/console | This message was automatically generated. Change the unit of StartupProgress 'PercentComplete' to percentage -- Key: HDFS-5361 URL: https://issues.apache.org/jira/browse/HDFS-5361 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: HDFS-5361.2.patch, HDFS-5361.3.patch, HDFS-5361.patch Now the unit of 'PercentComplete' metrics is rate (maximum is 1.0). It's confusing for users because its name includes percent. The metrics should be multiplied by 100. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true
[ https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524739#comment-14524739 ] Hadoop QA commented on HDFS-4960: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12591131/4960-trunk.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10583/console | This message was automatically generated. Unnecessary .meta seeks even when skip checksum is true --- Key: HDFS-4960 URL: https://issues.apache.org/jira/browse/HDFS-4960 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Varun Sharma Assignee: Varun Sharma Attachments: 4960-branch2.patch, 4960-trunk.patch While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found unnecessary seeks into .meta files; each seek was a 7-byte read at the head of the file that attempts to validate the version #. Since the client is requesting no-checksum, we should not need to touch the .meta file at all. Since the purpose of skip checksum is also to avoid the performance penalty of the extra seek, we should not seek into .meta if skip checksum is true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
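The fix presumably reduces to guarding the header read on the checksum flag; a sketch with assumed variable names (not the attached patch):
{code}
DataChecksum csum = null;
if (verifyChecksum) {
  // only pay for the 7-byte header seek/read when checksums are in use
  csum = BlockMetadataHeader.readHeader(metaIn).getChecksum();
}
{code}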
[jira] [Commented] (HDFS-5278) Reduce memory consumptions of TestDFSClientRetries
[ https://issues.apache.org/jira/browse/HDFS-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524723#comment-14524723 ] Hadoop QA commented on HDFS-5278: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12608351/HDFS-5278.001.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10573/console | This message was automatically generated. Reduce memory consumptions of TestDFSClientRetries -- Key: HDFS-5278 URL: https://issues.apache.org/jira/browse/HDFS-5278 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Minor Attachments: HDFS-5278.000.patch, HDFS-5278.001.patch TestDFSClientRetries::testDFSClientRetriesOnBusyBlocks() spawns about 50 threads during execution, each of which takes more than 6 MB of memory. This makes debugging it in Eclipse under the default settings difficult, since it triggers an OutOfMemoryError. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8309) Skip unit test using DataNodeTestUtils#injectDataDirFailure() on Windows
[ https://issues.apache.org/jira/browse/HDFS-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-8309: - Status: Patch Available (was: Open) Skip unit test using DataNodeTestUtils#injectDataDirFailure() on Windows Key: HDFS-8309 URL: https://issues.apache.org/jira/browse/HDFS-8309 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.7.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Priority: Minor Attachments: HDFS-8309.00.patch As [~cnauroth] noted in HDFS-7917 below, DataNodeTestUtils.injectDataDirFailure() won't work for Windows as rename will fail due to open handles on data node dir. This ticket is opened to skip these tests for Windows. bq.Unfortunately, I just remembered that the rename isn't going to work on Windows. It typically doesn't allow you to rename a directory where there are open file handles anywhere in the sub-tree. We'd have to shutdown the DataNode before doing the rename and then start it up. By doing that, we'd be changing the meaning of the test from covering an online failure to covering a failure at DataNode startup, so I don't think we can make that change. Below are the two test cases that need to be fixed: # TestDataNodeVolumeFailure#testFailedVolumeBeingRemovedFromDataNode # TestDataNodeHotSwapVolumes.testDirectlyReloadAfterCheckDiskError -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name
[ https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524325#comment-14524325 ] Lei (Eddy) Xu commented on HDFS-8305: - +1 non-binding. HDFS INotify: the destination field of RenameOp should always end with the file name Key: HDFS-8305 URL: https://issues.apache.org/jira/browse/HDFS-8305 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8305.001.patch HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524415#comment-14524415 ] Benoy Antony commented on HDFS-6407: The sorting seems to be correct for all the fields. Please let me know what makes you think otherwise and I can fix it. The purpose of this jira is to put back sorting on the data nodes tab. Since it was easy to enable sorting on other tables using the plugin, sorting and pagination were added to those tables as well. The processing is done completely on the client side, so there is no impact on the server side. At this point, the process of enabling sorting and pagination for a table is simple and uniform. To give up this simplicity, there should be some noticeable delay with the current approach. Tested with 6,000 files: it took time to render 6,000 items irrespective of whether sorting is enabled or not. If pagination is set, then items are painted instantly. In any case, no noticeable delay is caused by sorting, so I am not sure whether there is any point in optimizing this client-side logic. One enhancement which enables the page to render fast will be to set the default pagination to a reasonably high number like 200. In cases where there are 10,000+ items to paint, the page gets rendered fast if the page size is 200 items. If the user wants all items, then he can select "All" to display all files, or the user can search for the file in other pages. I will update the patch with the page size set to 200 items for all tables. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have the ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7678: Attachment: HDFS-7678-HDFS-7285.007.patch Erasure coding: DFSInputStream with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Li Bo Assignee: Zhe Zhang Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, HDFS-7678-HDFS-7285.007.patch, HDFS-7678.000.patch, HDFS-7678.001.patch A block group reader will read data from BlockGroup no matter in striping layout or contiguous layout. The corrupt blocks can be known before reading(told by namenode), or just be found during reading. The block group reader needs to do decoding work when some blocks are found corrupt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3618) SSH fencing option may incorrectly succeed if nc (netcat) command not present
[ https://issues.apache.org/jira/browse/HDFS-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524690#comment-14524690 ] Hadoop QA commented on HDFS-3618: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12597900/HDFS-3618.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10560/console | This message was automatically generated. SSH fencing option may incorrectly succeed if nc (netcat) command not present - Key: HDFS-3618 URL: https://issues.apache.org/jira/browse/HDFS-3618 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover Affects Versions: 2.0.0-alpha Reporter: Brahma Reddy Battula Assignee: Vinayakumar B Attachments: HDFS-3618.patch, HDFS-3618.patch, HDFS-3618.patch, zkfc.txt, zkfc_threaddump.out Started NN's and zkfc's on Suse11. Suse11 has a netcat installation, and netcat -z will work (but nc -z won't work). While executing the following command, we got command not found, hence rc will be other than zero, and we end up assuming that the server was down, without actually checking whether the service is down or not: {code} LOG.info("Indeterminate response from trying to kill service. " + "Verifying whether it is running using nc..."); rc = execCommand(session, "nc -z " + serviceAddr.getHostName() + " " + serviceAddr.getPort()); if (rc == 0) { // the service is still listening - we are unable to fence LOG.warn("Unable to fence - it is running but we cannot kill it"); return false; } else { LOG.info("Verified that the service is down."); return true; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
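Shells conventionally return 127 for a missing command, so the check can distinguish "port still open" from "nc not installed". A hedged sketch extending the snippet above (the rc values are standard shell conventions, not the committed patch):
{code}
rc = execCommand(session,
    "nc -z " + serviceAddr.getHostName() + " " + serviceAddr.getPort());
if (rc == 0) {
  // the service is still listening - we are unable to fence
  LOG.warn("Unable to fence - it is running but we cannot kill it");
  return false;
} else if (rc == 127) {
  // nc itself is missing: the result is indeterminate, so fail safe
  LOG.warn("nc command not found; cannot verify whether service is down");
  return false;
} else {
  LOG.info("Verified that the service is down.");
  return true;
}
{code}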
[jira] [Commented] (HDFS-4870) periodically re-resolve hostnames in included and excluded datanodes list
[ https://issues.apache.org/jira/browse/HDFS-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524702#comment-14524702 ] Hadoop QA commented on HDFS-4870: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12585903/HDFS-4870.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10565/console | This message was automatically generated. periodically re-resolve hostnames in included and excluded datanodes list - Key: HDFS-4870 URL: https://issues.apache.org/jira/browse/HDFS-4870 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4870.001.patch We currently only resolve the hostnames in the included and excluded datanodes list once-- when the list is read. The rationale for this is that in big clusters, DNS resolution for thousands of nodes can take a long time (when generating a datanode list in getDatanodeListForReport, for example). However, if the DNS information changes for one of these hosts, we should reflect that. A background thread could do these DNS resolutions every few minutes without blocking any foreground operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
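The description maps naturally onto a small scheduled task; a hypothetical sketch (class name, period, and cache-update step are assumptions, not the attached patch):
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: periodically refresh DNS for the host lists. */
class HostListResolver {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void start(final List<String> hostNames, long periodMinutes) {
    scheduler.scheduleAtFixedRate(new Runnable() {
      @Override
      public void run() {
        for (String host : hostNames) {
          try {
            InetAddress addr = InetAddress.getByName(host);  // re-resolve
            // here: update the cached include/exclude entry with addr
          } catch (UnknownHostException e) {
            // keep the stale mapping; resolution may succeed next round
          }
        }
      }
    }, periodMinutes, periodMinutes, TimeUnit.MINUTES);
  }
}
{code}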
[jira] [Commented] (HDFS-4660) Duplicated checksum on DN in a recovered pipeline
[ https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524696#comment-14524696 ] Hadoop QA commented on HDFS-4660: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12576518/HDFS-4660.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10562/console | This message was automatically generated. Duplicated checksum on DN in a recovered pipeline - Key: HDFS-4660 URL: https://issues.apache.org/jira/browse/HDFS-4660 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Peng Zhang Priority: Critical Attachments: HDFS-4660.patch Pipeline: DN1 DN2 DN3. Stop DN2; the pipeline adds node DN4 at the 2nd position: DN1 DN4 DN3. Recover RBW. DN4 after recover rbw: 2013-04-01 21:02:31,570 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover RBW replica BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004 2013-04-01 21:02:31,570 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW getNumBytes() = 134144 getBytesOnDisk() = 134144 getVisibleLength()= 134144 end at chunk (134144/512=262) DN3 after recover rbw: 2013-04-01 21:02:31,575 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover RBW replica BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004 2013-04-01 21:02:31,575 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW getNumBytes() = 134028 getBytesOnDisk() = 134028 getVisibleLength()= 134028 Client sends a packet after pipeline recovery: offset=133632 len=1008 DN4 after flush: 2013-04-01 21:02:31,779 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file offset:134640; meta offset:1063 // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 1063. DN3 after flush: 2013-04-01 21:02:31,782 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, lastPacketInBlock=false, offsetInBlock=134640, ackEnqueueNanoTime=8817026136871545) 2013-04-01 21:02:31,782 DEBUG org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing meta file offset of block BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 1055 to 1051 2013-04-01 21:02:31,782 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file offset:134640; meta offset:1059 After checking the meta on DN4, I found the checksum of chunk 262 is duplicated, but the data is not. Later, after the block was finalized, DN4's scanner detected the bad block and reported it to the NN. The NN sent a command to delete this block and to replicate it from another DN in the pipeline to satisfy the replication factor. I think this is because BlockReceiver skips data bytes already written, but does not skip checksum bytes already written.
And the function adjustCrcFilePosition is only used for the last non-completed chunk, not for this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4812) add hdfsReadFully, hdfsWriteFully
[ https://issues.apache.org/jira/browse/HDFS-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524710#comment-14524710 ] Hadoop QA commented on HDFS-4812: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12582553/HDFS-4812.001.patch | | Optional Tests | javac unit | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10567/console | This message was automatically generated. add hdfsReadFully, hdfsWriteFully - Key: HDFS-4812 URL: https://issues.apache.org/jira/browse/HDFS-4812 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.1.0-beta Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4812.001.patch It would be nice to have {{hdfsReadFully}} and {{hdfsWriteFully}} in libhdfs. The current APIs don't guarantee that we read or write as much as we're told to do. We have readFully and writeFully in Java, but not in libhdfs at the moment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
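libhdfs itself is C, but the loop it needs is exactly Java's readFully contract; a Java sketch of the semantics a C wrapper around hdfsRead (and symmetrically hdfsWrite) would mirror:
{code}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadFullySketch {
  static void readFully(InputStream in, byte[] buf, int off, int len)
      throws IOException {
    int done = 0;
    while (done < len) {
      int n = in.read(buf, off + done, len - done);  // may be a short read
      if (n < 0) {
        throw new EOFException("EOF after " + done + " of " + len + " bytes");
      }
      done += n;
    }
  }
}
{code}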
[jira] [Commented] (HDFS-5040) Audit log for admin commands/ logging output of all DFS admin commands
[ https://issues.apache.org/jira/browse/HDFS-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524721#comment-14524721 ] Hadoop QA commented on HDFS-5040: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12607420/HDFS-5040.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10571/console | This message was automatically generated. Audit log for admin commands/ logging output of all DFS admin commands -- Key: HDFS-5040 URL: https://issues.apache.org/jira/browse/HDFS-5040 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Raghu C Doppalapudi Assignee: Shinichi Yamashita Attachments: HDFS-5040.patch, HDFS-5040.patch, HDFS-5040.patch Enable the audit log for all admin commands, and also provide the ability to log all admin commands to a separate log file; at this point, all the logging is displayed on the console. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
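A sketch of the shape such logging might take, reusing the tab-separated audit-line format the namesystem audit logger already emits (the method name and placeholder field values here are illustrative assumptions):
{code}
void logAdminAudit(String user, String cmd, boolean succeeded) {
  // same field layout as the FSNamesystem audit log, with an admin cmd
  AUDIT_LOG.info("allowed=" + succeeded + "\tugi=" + user
      + "\tcmd=" + cmd + "\tsrc=null\tdst=null\tperm=null");
}
{code}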
[jira] [Commented] (HDFS-3627) OfflineImageViewer oiv Indented processor prints out the Java class name in the DELEGATION_KEY field
[ https://issues.apache.org/jira/browse/HDFS-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524689#comment-14524689 ] Hadoop QA commented on HDFS-3627: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12574930/HDFS-3627.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10559/console | This message was automatically generated. OfflineImageViewer oiv Indented processor prints out the Java class name in the DELEGATION_KEY field Key: HDFS-3627 URL: https://issues.apache.org/jira/browse/HDFS-3627 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Ravi Prakash Priority: Minor Attachments: HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch, HDFS-3627.patch Instead of the contents of the delegation key this is printed out DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@1e2ca7 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@105bd58 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@1d1e730 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@1a116c9 DELEGATION_KEY = org.apache.hadoop.security.token.delegation.DelegationKey@df1832 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3512) Delay in scanning blocks at DN side when there are huge number of blocks
[ https://issues.apache.org/jira/browse/HDFS-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524691#comment-14524691 ] Hadoop QA commented on HDFS-3512: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12531266/HDFS-3512.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10561/console | This message was automatically generated. Delay in scanning blocks at DN side when there are huge number of blocks Key: HDFS-3512 URL: https://issues.apache.org/jira/browse/HDFS-3512 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.0.0-alpha Reporter: suja s Assignee: amith Attachments: HDFS-3512.patch The block scanner maintains the full list of blocks at the DN side in a map, and there is no differentiation between the blocks which are already scanned and the ones not yet scanned. For every check (i.e. every 5 secs) it will pick one block and scan it. There is a chance that it chooses a block which is already scanned, which leads to further delay in scanning the blocks which are yet to be scanned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
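One way to realize the differentiation the report asks for is two queues, so a block is only rescanned after every other block has had a turn; a hypothetical sketch:
{code}
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: track pending and already-scanned blocks separately. */
class ScanQueue<B> {
  private final Deque<B> pending = new ArrayDeque<B>();
  private final Deque<B> scanned = new ArrayDeque<B>();

  /** Always pick a not-yet-scanned block first. */
  B next() {
    if (pending.isEmpty()) {        // full pass done: start a new one
      pending.addAll(scanned);
      scanned.clear();
    }
    B b = pending.poll();
    if (b != null) {
      scanned.add(b);
    }
    return b;
  }
}
{code}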
[jira] [Commented] (HDFS-4387) libhdfs doesn't work with jamVM
[ https://issues.apache.org/jira/browse/HDFS-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524697#comment-14524697 ] Hadoop QA commented on HDFS-4387: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12564522/01.patch | | Optional Tests | javac unit javadoc findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10563/console | This message was automatically generated. libhdfs doesn't work with jamVM --- Key: HDFS-4387 URL: https://issues.apache.org/jira/browse/HDFS-4387 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 3.0.0 Reporter: Andy Isaacson Priority: Minor Attachments: 01.patch Building and running tests on OpenJDK 7 on Ubuntu 12.10 fails with {{mvn test -Pnative}}. The output is hard to decipher but the underlying issue is that {{test_libhdfs_native}} segfaults at startup. {noformat} (gdb) run Starting program: /mnt/trunk/hadoop-hdfs-project/hadoop-hdfs/target/native/test_libhdfs_threaded [Thread debugging using libthread_db enabled] Using host libthread_db library /lib/x86_64-linux-gnu/libthread_db.so.1. Program received signal SIGSEGV, Segmentation fault. 0x7739a897 in attachJNIThread (name=0x0, is_daemon=is_daemon@entry=0 '\000', group=0x0) at thread.c:768 768 thread.c: No such file or directory. (gdb) where #0 0x7739a897 in attachJNIThread (name=0x0, is_daemon=is_daemon@entry=0 '\000', group=0x0) at thread.c:768 #1 0x77395020 in attachCurrentThread (is_daemon=0, args=0x0, penv=0x7fffddb8) at jni.c:1454 #2 Jam_AttachCurrentThread (vm=optimized out, penv=0x7fffddb8, args=0x0) at jni.c:1466 #3 0x77bcf979 in getGlobalJNIEnv () at /mnt/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:527 #4 getJNIEnv () at /mnt/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:585 #5 0x00402512 in nmdCreate (conf=conf@entry=0x7fffdeb0) at /mnt/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/native_mini_dfs.c:49 #6 0x004016e1 in main () at /mnt/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c:283 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4924) Show NameNode state on dfsclusterhealth page
[ https://issues.apache.org/jira/browse/HDFS-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524712#comment-14524712 ] Hadoop QA commented on HDFS-4924: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12588932/HDFS-4924.trunk.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10568/console | This message was automatically generated. Show NameNode state on dfsclusterhealth page Key: HDFS-4924 URL: https://issues.apache.org/jira/browse/HDFS-4924 Project: Hadoop HDFS Issue Type: Improvement Components: federation Affects Versions: 2.1.0-beta Reporter: Lohit Vijayarenu Assignee: Lohit Vijayarenu Attachments: HDFS-4924.trunk.1.patch dfsclusterhealth.jsp shows a summary of multiple namenodes in a cluster. With federation combined with HA, it becomes difficult to quickly know the state of the NameNodes in the cluster. It would be good to show whether a NameNode is Active/Standby on the summary page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5066) Inode tree with snapshot information visualization
[ https://issues.apache.org/jira/browse/HDFS-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524713#comment-14524713 ] Hadoop QA commented on HDFS-5066: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12604243/HDFS-5066.v3.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10569/console | This message was automatically generated. Inode tree with snapshot information visualization --- Key: HDFS-5066 URL: https://issues.apache.org/jira/browse/HDFS-5066 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-5066.v1.patch, HDFS-5066.v2.patch, HDFS-5066.v3.patch, visnap.png It would be nice to be able to visualize snapshot information, in order to ease the understanding of related data structures. We can generate graph from in memory inode links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4160) libhdfs / fuse-dfs should implement O_CREAT | O_EXCL
[ https://issues.apache.org/jira/browse/HDFS-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524706#comment-14524706 ] Hadoop QA commented on HDFS-4160: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12585242/HDFS-4160.001.patch | | Optional Tests | javac unit | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10566/console | This message was automatically generated. libhdfs / fuse-dfs should implement O_CREAT | O_EXCL Key: HDFS-4160 URL: https://issues.apache.org/jira/browse/HDFS-4160 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Affects Versions: 2.0.3-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-4160.001.patch {{hdfsOpenFile}} contains this code: {code} if ((flags & O_CREAT) && (flags & O_EXCL)) { fprintf(stderr, "WARN: hdfs does not truly support O_CREATE && O_EXCL\n"); } {code} But {{hdfsOpenFile}} could easily support *O_CREAT* | *O_EXCL* by calling {{FileSystem#create}} with {{overwrite = false}}. We should do this. It would also benefit {{fuse-dfs}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
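On the Java side, the mapping the description proposes is essentially a one-liner; a sketch of what the libhdfs wrapper would end up calling (the path is illustrative):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class ExclusiveCreateSketch {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    // overwrite=false gives O_EXCL semantics: throws if the path exists
    fs.create(new Path("/tmp/exclusive-demo"), false).close();
  }
}
{code}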
[jira] [Commented] (HDFS-4837) Allow DFSAdmin to run when HDFS is not the default file system
[ https://issues.apache.org/jira/browse/HDFS-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524699#comment-14524699 ] Hadoop QA commented on HDFS-4837: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12583706/HDFS-4837.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10564/console | This message was automatically generated. Allow DFSAdmin to run when HDFS is not the default file system -- Key: HDFS-4837 URL: https://issues.apache.org/jira/browse/HDFS-4837 Project: Hadoop HDFS Issue Type: New Feature Reporter: Mostafa Elhemali Assignee: Mostafa Elhemali Attachments: HDFS-4837.patch When Hadoop is running a different default file system than HDFS but still has an HDFS namenode running, we are unable to run dfsadmin commands. I suggest that DFSAdmin use the same mechanism as NameNode does today to get its address: look at dfs.namenode.rpc-address, and if it is not set, fall back on getting it from the default file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
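A sketch of the suggested lookup order (simplified, with no HA handling; it assumes the default-FS URI carries a usable host:port authority):
{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.net.NetUtils;

class NNAddressSketch {
  static InetSocketAddress getNNAddress(Configuration conf)
      throws IOException {
    // explicit setting wins, whatever fs.defaultFS points at
    String addr = conf.getTrimmed("dfs.namenode.rpc-address");
    if (addr != null && !addr.isEmpty()) {
      return NetUtils.createSocketAddr(addr);
    }
    // otherwise fall back to the default file system, as NameNode does
    return NetUtils.createSocketAddr(
        FileSystem.getDefaultUri(conf).getAuthority());
  }
}
{code}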
[jira] [Commented] (HDFS-6861) Separate Balancer specific logic from Dispatcher
[ https://issues.apache.org/jira/browse/HDFS-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524880#comment-14524880 ] Hadoop QA commented on HDFS-6861: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 1s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12662699/h6861_20140819.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10642/console | This message was automatically generated. Separate Balancer specific logic from Dispatcher Key: HDFS-6861 URL: https://issues.apache.org/jira/browse/HDFS-6861 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6861_20140818.patch, h6861_20140819.patch In order to balance the datanode storage utilization of a cluster, Balancer (1) classifies datanodes into different groups (overUtilized, aboveAvgUtilized, belowAvgUtilized and underUtilized), (2) chooses source and target datanode pairs and (3) chooses blocks to move. Some of this logic is in Dispatcher. It is better to separate it out. This JIRA is further work following HDFS-6828. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5887) Add suffix to generated protobuf class
[ https://issues.apache.org/jira/browse/HDFS-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524877#comment-14524877 ] Hadoop QA commented on HDFS-5887: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 1s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12661725/HDFS-5887.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10641/console | This message was automatically generated. Add suffix to generated protobuf class -- Key: HDFS-5887 URL: https://issues.apache.org/jira/browse/HDFS-5887 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-5698 (FSImage in protobuf) Reporter: Haohui Mai Assignee: Tassapol Athiapinya Priority: Minor Attachments: HDFS-5887.000.patch, HDFS-5887.000.proto_files-only.patch, HDFS-5887.001.patch As suggested by [~tlipcon], the code is more readable if we give each class generated by the protobuf the suffix Proto. This jira proposes to rename the classes and to introduce no functionality changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name
[ https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524129#comment-14524129 ] Hadoop QA commented on HDFS-8305: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 8s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 50s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 21s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 27s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 48s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 33s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 4m 23s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 113m 37s | Tests failed in hadoop-hdfs. | | | | 162m 29s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestEncryptionZonesWithKMS | | Timed out tests | org.apache.hadoop.hdfs.TestFileCreation | | | org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729603/HDFS-8305.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3393461 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10508/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10508/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10508/console | This message was automatically generated. HDFS INotify: the destination field of RenameOp should always end with the file name Key: HDFS-8305 URL: https://issues.apache.org/jira/browse/HDFS-8305 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8305.001.patch HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8309) Skip unit test using DataNodeTestUtils#injectDataDirFailure() on Windows
[ https://issues.apache.org/jira/browse/HDFS-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-8309: - Attachment: HDFS-8309.00.patch Attaching a patch that ensures all tests that use DataNodeTestUtils.injectDataDirFailure() are skipped on Windows, and updates the comments. Skip unit test using DataNodeTestUtils#injectDataDirFailure() on Windows Key: HDFS-8309 URL: https://issues.apache.org/jira/browse/HDFS-8309 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.7.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Priority: Minor Attachments: HDFS-8309.00.patch As [~cnauroth] noted in HDFS-7917 below, DataNodeTestUtils.injectDataDirFailure() won't work for Windows as rename will fail due to open handles on data node dir. This ticket is opened to skip these tests for Windows. bq.Unfortunately, I just remembered that the rename isn't going to work on Windows. It typically doesn't allow you to rename a directory where there are open file handles anywhere in the sub-tree. We'd have to shutdown the DataNode before doing the rename and then start it up. By doing that, we'd be changing the meaning of the test from covering an online failure to covering a failure at DataNode startup, so I don't think we can make that change. Below are the two test cases that need to be fixed: # TestDataNodeVolumeFailure#testFailedVolumeBeingRemovedFromDataNode # TestDataNodeHotSwapVolumes.testDirectlyReloadAfterCheckDiskError -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8229: Resolution: Fixed Fix Version/s: 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks for the contribution [~surendrasingh]. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Fix For: 2.8.0 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After namenode restart and before DN's registration if {{LazyPersistFileScrubber}} will run then it will delete Lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8312) Trash does not descend into child directories to check for permissions
Eric Yang created HDFS-8312: --- Summary: Trash does not descend into child directories to check for permissions Key: HDFS-8312 URL: https://issues.apache.org/jira/browse/HDFS-8312 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, security Affects Versions: 2.6.0, 2.2.0 Reporter: Eric Yang HDFS trash does not descend into child directories to check if the user has permission to delete files. For example: Run the following commands to initialize the directory structure as super user: {code} hadoop fs -mkdir /BSS/level1 hadoop fs -mkdir /BSS/level1/level2 hadoop fs -mkdir /BSS/level1/level2/level3 hadoop fs -put /tmp/appConfig.json /BSS/level1/level2/level3/testfile.txt hadoop fs -chown user1:users /BSS/level1/level2/level3/testfile.txt hadoop fs -chown -R user1:users /BSS/level1 hadoop fs -chmod -R 750 /BSS/level1 hadoop fs -chmod -R 640 /BSS/level1/level2/level3/testfile.txt hadoop fs -chmod 775 /BSS {code} Change to a normal user called user2. When trash is enabled: {code} sudo su user2 - hadoop fs -rm -r /BSS/level1 15/05/01 16:51:20 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 3600 minutes, Emptier interval = 0 minutes. Moved: 'hdfs://bdvs323.svl.ibm.com:9000/BSS/level1' to trash at: hdfs://bdvs323.svl.ibm.com:9000/user/user2/.Trash/Current {code} When trash is disabled: {code} /opt/ibm/biginsights/IHC/bin/hadoop fs -Dfs.trash.interval=0 -rm -r /BSS/level1 15/05/01 16:58:31 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes. rm: Permission denied: user=user2, access=ALL, inode=/BSS/level1:user1:users:drwxr-x--- {code} There is an inconsistency between trash behavior and delete behavior. When trash is enabled, files owned by user1 are deleted by user2. It looks like trash does not recursively validate whether the files in child directories can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
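A consistent trash implementation would need to walk the subtree and check delete permission before the rename into .Trash. A hedged sketch using FileSystem#access (available in 2.6+; a real delete check enforces write permission on the containing directories, which this only approximates):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

class TrashPermissionCheck {
  /** Throws AccessControlException if the caller cannot delete the subtree. */
  static void checkCanDeleteRecursively(FileSystem fs, Path p)
      throws IOException {
    fs.access(p, FsAction.WRITE);
    for (FileStatus child : fs.listStatus(p)) {
      if (child.isDirectory()) {
        checkCanDeleteRecursively(fs, child.getPath());
      }
    }
  }
}
{code}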
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524425#comment-14524425 ] Yi Liu commented on HDFS-7348: -- Thanks Zhe for the review and good comments! I am updating the patch to address your comments, and will reply to them later. Just a quick response: {quote} The test failed on my local machine, reporting NPE when closing file: {quote} Are you referring to {{TestRecoverStripedFile.java}}? It runs successfully in my local environment, and I also confirmed that it succeeds on the latest branch. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524457#comment-14524457 ] Hudson commented on HDFS-8229: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7716 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7716/]) HDFS-8229. LAZY_PERSIST file gets deleted after NameNode restart. (Contributed by Surendra Singh Lilhore) (arp: rev 6f541edce0ed64bf316276715c4bc07794ff20ac) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Fix For: 2.8.0 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} If the {{LazyPersistFileScrubber}} runs after a NameNode restart but before the DataNodes register, it will delete lazy persist files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3927) DFSInputStream#blockSeekTo may print incorrect warn msg for IOException
[ https://issues.apache.org/jira/browse/HDFS-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524569#comment-14524569 ] Hadoop QA commented on HDFS-3927: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12544787/HDFS-3927.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10523/console | This message was automatically generated. DFSInputStream#blockSeekTo may print incorrect warn msg for IOException --- Key: HDFS-3927 URL: https://issues.apache.org/jira/browse/HDFS-3927 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-3927.patch In DFSInputStream#blockSeekTo, for most IOExceptions (except InvalidEncryptionKeyException and InvalidBlockTokenException), the current code only prints "Failed to connect to " + targetAddr, which can be incorrect since there are other possible cases, e.g., the chosen node (remote or local) can be connected but the block/replica cannot be found. The correct error message (including the error message from the remote node) should be logged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
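A runnable illustration of the logging improvement being asked for; the address and helper method are made up, and this is not the committed patch. The point is to log the exception itself so a remote-side error such as a missing replica is not collapsed into a generic connection failure:
{code}
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class ReadWarnLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ReadWarnLoggingSketch.class);

  // Hypothetical stand-in for the connect-and-read attempt in blockSeekTo().
  static void readBlock(String targetAddr) throws IOException {
    throw new IOException("Replica not found on " + targetAddr);
  }

  public static void main(String[] args) {
    String targetAddr = "dn1.example.com:50010";
    try {
      readBlock(targetAddr);
    } catch (IOException ex) {
      // Logging the exception preserves the remote node's message instead of
      // reporting every failure as "Failed to connect to " + targetAddr.
      LOG.warn("Failed to read block from " + targetAddr + ": " + ex, ex);
    }
  }
}
{code}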
[jira] [Commented] (HDFS-3383) libhdfs does not build on ARM because jni_md.h is not found
[ https://issues.apache.org/jira/browse/HDFS-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524560#comment-14524560 ] Hadoop QA commented on HDFS-3383: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12525971/HDFS-3383.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10522/console | This message was automatically generated. libhdfs does not build on ARM because jni_md.h is not found --- Key: HDFS-3383 URL: https://issues.apache.org/jira/browse/HDFS-3383 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 0.23.1 Environment: Linux 3.2.0-1412-omap4 #16-Ubuntu SMP PREEMPT Tue Apr 17 19:38:42 UTC 2012 armv7l armv7l armv7l GNU/Linux java version "1.7.0_04-ea" Java(TM) SE Runtime Environment for Embedded (build 1.7.0_04-ea-b20, headless) Java HotSpot(TM) Embedded Server VM (build 23.0-b21, mixed mode, experimental) Reporter: Trevor Robinson Attachments: HDFS-3383.patch The wrong include directory is used for jni_md.h: [INFO] --- make-maven-plugin:1.0-beta-1:make-install (compile) @ hadoop-hdfs --- [INFO] /bin/bash ./libtool --tag=CC --mode=compile gcc -DPACKAGE_NAME=\"libhdfs\" -DPACKAGE_TARNAME=\"libhdfs\" -DPACKAGE_VERSION=\"0.1.0\" -DPACKAGE_STRING=\"libhdfs 0.1.0\" -DPACKAGE_BUGREPORT=\"omal...@apache.org\" -DPACKAGE_URL=\"\" -DPACKAGE=\"libhdfs\" -DVERSION=\"0.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_STRDUP=1 -DHAVE_STRERROR=1 -DHAVE_STRTOUL=1 -DHAVE_FCNTL_H=1 -DHAVE__BOOL=1 -DHAVE_STDBOOL_H=1 -I. -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\"arm\" -I/usr/lib/jvm/ejdk1.7.0_04/include -I/usr/lib/jvm/ejdk1.7.0_04/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c -o hdfs.lo hdfs.c [INFO] libtool: compile: gcc -DPACKAGE_NAME=\"libhdfs\" -DPACKAGE_TARNAME=\"libhdfs\" -DPACKAGE_VERSION=\"0.1.0\" -DPACKAGE_STRING=\"libhdfs 0.1.0\" -DPACKAGE_BUGREPORT=\"omal...@apache.org\" -DPACKAGE_URL=\"\" -DPACKAGE=\"libhdfs\" -DVERSION=\"0.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_STRDUP=1 -DHAVE_STRERROR=1 -DHAVE_STRTOUL=1 -DHAVE_FCNTL_H=1 -DHAVE__BOOL=1 -DHAVE_STDBOOL_H=1 -I. -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\"arm\" -I/usr/lib/jvm/ejdk1.7.0_04/include -I/usr/lib/jvm/ejdk1.7.0_04/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c hdfs.c -fPIC -DPIC -o .libs/hdfs.o [INFO] In file included from hdfs.h:33:0, [INFO] from hdfs.c:19: [INFO] /usr/lib/jvm/ejdk1.7.0_04/include/jni.h:45:20: fatal error: jni_md.h: No such file or directory [INFO] compilation terminated. [INFO] make: *** [hdfs.lo] Error 1 The problem is caused by hadoop-hdfs-project/hadoop-hdfs/src/main/native/m4/apsupport.m4 overriding supported_os=arm when host_cpu=arm*; supported_os should remain linux, since it determines the jni_md.h include path.
OpenJDK 6 and 7 (in Ubuntu 12.04, at least) and Oracle EJDK put jni_md.h in include/linux. Not sure if/why this ever worked before. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3118) wiki and hadoop templates provide wrong superusergroup property instead of supergroup
[ https://issues.apache.org/jira/browse/HDFS-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524548#comment-14524548 ] Hadoop QA commented on HDFS-3118: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12519052/supergroup.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10520/console | This message was automatically generated. wiki and hadoop templates provide wrong superusergroup property instead of supergroup -- Key: HDFS-3118 URL: https://issues.apache.org/jira/browse/HDFS-3118 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.0.0, 1.0.1 Environment: Used Debian package install Reporter: Olivier Sallou Priority: Minor Attachments: supergroup.patch The hdfs-site template and the wiki (http://hadoop.apache.org/hdfs/docs/current/hdfs_permissions_guide.html#The+Super-User) refer to the property dfs.permissions.superusergroup to define the superuser group. However, we must use the property dfs.permissions.supergroup, not superusergroup, to make it work. In src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java, the supergroup is read via: this.supergroup = conf.get("dfs.permissions.supergroup", "supergroup"); It does not make use of DFS_PERMISSIONS_SUPERUSERGROUP_KEY. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
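A minimal, runnable demonstration of the mismatch (the group name "admins" is a made-up example): only the key the code actually reads takes effect.
{code}
import org.apache.hadoop.conf.Configuration;

public class SupergroupKeySketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    // Set the key the wiki and the hdfs-site template document:
    conf.set("dfs.permissions.superusergroup", "admins");
    // ...but the 1.x FSNamesystem reads "dfs.permissions.supergroup", so the
    // wiki-suggested key is silently ignored and the default wins:
    System.out.println(conf.get("dfs.permissions.supergroup", "supergroup"));
    // prints "supergroup", not "admins"
  }
}
{code}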
[jira] [Commented] (HDFS-3456) blockReportInterval is a long value but when we take the random value it uses getRandom().nextInt, causing frequent BRs
[ https://issues.apache.org/jira/browse/HDFS-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524553#comment-14524553 ] Hadoop QA commented on HDFS-3456: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12530588/h3456_20120601.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10521/console | This message was automatically generated. blockReportInterval is a long value but when we take the random value it uses getRandom().nextInt, causing frequent BRs --- Key: HDFS-3456 URL: https://issues.apache.org/jira/browse/HDFS-3456 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.0.0-alpha Reporter: Brahma Reddy Battula Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: HDFS-3456.001.patch, HDFS-3456.002.patch, HDFS-3456.003.patch, HDFS-3456.004.patch, h3456_20120601.patch blockReportInterval is a long value, but the random offset is computed with getRandom().nextInt. Because of this, offerService can throw an exception, as the long may wrap to a negative value in the int cast. As a result, block reports may be sent very frequently. {code} if (resetBlockReportTime) { lastBlockReport = startTime - DFSUtil.getRandom().nextInt((int)(dnConf.blockReportInterval)); resetBlockReportTime = false; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
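A runnable illustration of the overflow, plus one overflow-free way to compute the jitter without the int cast. The interval value is chosen above Integer.MAX_VALUE just to show the wrap; this is not the committed patch:
{code}
import java.util.Random;

public class BlockReportJitterSketch {
  public static void main(String[] args) {
    long blockReportInterval = 3000000000L; // ~34.7 days in ms, > Integer.MAX_VALUE
    // The narrowing cast wraps to a negative int, and Random.nextInt(bound)
    // throws IllegalArgumentException for bound <= 0, which is the
    // offerService failure described above:
    System.out.println((int) blockReportInterval); // prints -1294967296

    // One safe alternative: scale a double into [0, interval).
    long offset = (long) (new Random().nextDouble() * blockReportInterval);
    long lastBlockReport = System.currentTimeMillis() - offset;
    System.out.println("jittered lastBlockReport = " + lastBlockReport);
  }
}
{code}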
[jira] [Commented] (HDFS-5296) DFS usage gets doubled in the WebUI of federated namenode
[ https://issues.apache.org/jira/browse/HDFS-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524775#comment-14524775 ] Hadoop QA commented on HDFS-5296: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12606661/HDFS-5296-v1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10601/console | This message was automatically generated. DFS usage gets doubled in the WebUI of federated namenode - Key: HDFS-5296 URL: https://issues.apache.org/jira/browse/HDFS-5296 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: BBF12817-B83E-4CC5-B0B8-4FA322E87FB7.png, HDFS-5296-v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3503) Move LengthInputStream and PositionTrackingInputStream to common
[ https://issues.apache.org/jira/browse/HDFS-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524772#comment-14524772 ] Hadoop QA commented on HDFS-3503: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12639073/h3503_20140407.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10598/console | This message was automatically generated. Move LengthInputStream and PositionTrackingInputStream to common Key: HDFS-3503 URL: https://issues.apache.org/jira/browse/HDFS-3503 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h3503_20140328.patch, h3503_20140407.patch We have LengthInputStream in org.apache.hadoop.hdfs.server.datanode.fsdataset and PositionTrackingInputStream in org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader. These two classes are generally useful. Let's move them to org.apache.hadoop.io. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5951) Provide diagnosis information in the Web UI
[ https://issues.apache.org/jira/browse/HDFS-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524770#comment-14524770 ] Hadoop QA commented on HDFS-5951: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12628903/HDFS-5951.000.patch | | Optional Tests | | | git revision | trunk / f1a152c | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10596/console | This message was automatically generated. Provide diagnosis information in the Web UI --- Key: HDFS-5951 URL: https://issues.apache.org/jira/browse/HDFS-5951 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5951.000.patch, diagnosis-failure.png, diagnosis-succeed.png HDFS should provide operation statistics in its UI. It can go one step further by leveraging the information to diagnose common problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)