[jira] [Commented] (HDFS-7056) Snapshot support for truncate
[ https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171346#comment-14171346 ]

Hari Mankude commented on HDFS-7056:

Is the proposal to copy the list of blocks to the snapshot copy only when the file is truncated, or whenever a snapshot is taken, irrespective of whether the file is truncated? After the file is truncated and then appended again, will all subsequent snapshots of the file get a copy of the block list?

Snapshot support for truncate
-----------------------------

Key: HDFS-7056
URL: https://issues.apache.org/jira/browse/HDFS-7056
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko

The implementation of truncate in HDFS-3107 does not allow truncating files which are in a snapshot. It is desirable to be able to truncate and still keep the old state of the file in the snapshot.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5427) not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
[ https://issues.apache.org/jira/browse/HDFS-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814338#comment-13814338 ]

Hari Mankude commented on HDFS-5427:

Is this patch going to be backported to 2.2 as well?

not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
-------------------------------------------------------------------------------------------------------------

Key: HDFS-5427
URL: https://issues.apache.org/jira/browse/HDFS-5427
Project: Hadoop HDFS
Issue Type: Bug
Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
Priority: Blocker
Fix For: 2.3.0
Attachments: HDFS-5427-v2.patch, HDFS-5427.patch, HDFS-5427.patch

1. Allow snapshots under dir /foo.
2. Create a file /foo/bar.
3. Create a snapshot s1 under /foo.
4. Delete the file /foo/bar.
5. Wait for a checkpoint, or do saveNamespace.
6. Restart the NN.
7. Now try to read the file from the snapshot: /foo/.snapshot/s1/bar.

The client will get BlockMissingException. The reason is that while loading the deleted-file list for a snapshottable dir from the fsimage, the blocks were not updated in the blocksMap.
[jira] [Commented] (HDFS-4872) Idempotent delete operation.
[ https://issues.apache.org/jira/browse/HDFS-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675915#comment-13675915 ]

Hari Mankude commented on HDFS-4872:

modTime cannot be used as a unique parameter because of the race between delete and append. If HDFS had a cTime, it would have been a unique file-specific value in combination with the file path.

Idempotent delete operation.
----------------------------

Key: HDFS-4872
URL: https://issues.apache.org/jira/browse/HDFS-4872
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko

Making delete idempotent is important to provide uninterrupted job execution in case of HA failover. This is to discuss different approaches to an idempotent implementation of delete.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
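The race the comment describes can be illustrated with a small simulation: a hypothetical idempotent delete that identifies the file by (path, modTime) is broken by an append landing between the original request and the post-failover retry, while a creation-time check would still identify the same file. All names here (NameNode, delete_if) are illustrative, not the actual HDFS API.

```python
class Inode:
    def __init__(self, path, ctime):
        self.path, self.ctime, self.mtime = path, ctime, ctime

class NameNode:
    def __init__(self):
        self.files = {}

    def create(self, path, now):
        self.files[path] = Inode(path, now)

    def append(self, path, now):
        self.files[path].mtime = now  # append bumps modTime; a cTime would not change

    def delete_if(self, path, attr, value):
        # idempotent delete: only act if the identifying attribute still matches
        inode = self.files.get(path)
        if inode is not None and getattr(inode, attr) == value:
            del self.files[path]
            return True
        return False

nn = NameNode()
nn.create("/f", now=1)
mtime_seen, ctime_seen = nn.files["/f"].mtime, nn.files["/f"].ctime

nn.append("/f", now=2)  # concurrent append before the delete retry
# modTime no longer matches, so the retried delete is wrongly rejected:
assert nn.delete_if("/f", "mtime", mtime_seen) is False
# a creation-time check still identifies the same file:
assert nn.delete_if("/f", "ctime", ctime_seen) is True
```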
[jira] [Created] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files
Hari Mankude created HDFS-4873:
-------------------------------

Summary: callGetBlockLocations returns incorrect number of blocks for snapshotted files
Key: HDFS-4873
URL: https://issues.apache.org/jira/browse/HDFS-4873
Project: Hadoop HDFS
Issue Type: Bug
Components: snapshots
Affects Versions: 3.0.0
Reporter: Hari Mankude
Assignee: Jing Zhao

callGetBlockLocations() returns all the blocks of a file even when they are not present in the snap version.
[jira] [Commented] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files
[ https://issues.apache.org/jira/browse/HDFS-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673558#comment-13673558 ]

Hari Mankude commented on HDFS-4873:

The sequence of operations for reproducing the problem:

1. Create a file of size one block.
2. Take a snapshot.
3. Append some data to the file.
4. Use DfsClient.callGetBlockLocations() to get the block locations of the snapshot version of the file. The file length is specified as Long.MAX_VALUE.
5. This call returns two LocatedBlocks for the snapshot version of the file instead of one block.
[jira] [Commented] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files
[ https://issues.apache.org/jira/browse/HDFS-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673562#comment-13673562 ]

Hari Mankude commented on HDFS-4873:

It looks like the problem is in getBlockLocationsUpdateTimes(), where the length is not truncated to fileSize before calling createLocatedBlocks(). Other solutions are possible if the snap inode is passed in.
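A toy model of the suspected bug: if the requested length (Long.MAX_VALUE) is not clamped to the snapshot's recorded file size, block enumeration walks past the snapshot boundary into blocks appended afterwards. The block arithmetic below is illustrative, not the actual createLocatedBlocks() logic.

```python
BLOCK = 64  # toy block size; real HDFS blocks are much larger

def located_blocks(blocks, offset, length, snapshot_size, clamp):
    """Return the blocks overlapping the byte range [offset, offset + length)."""
    if clamp:
        # the truncation the comment says is missing in getBlockLocationsUpdateTimes()
        length = min(length, snapshot_size - offset)
    end = offset + length
    return [b for i, b in enumerate(blocks)
            if i * BLOCK < end and (i + 1) * BLOCK > offset]

blocks = ["blk_1", "blk_2"]  # blk_2 was appended after the snapshot was taken
LONG_MAX = 2**63 - 1

# without clamping, the snapshot read leaks the post-snapshot block:
assert located_blocks(blocks, 0, LONG_MAX, snapshot_size=BLOCK, clamp=False) == ["blk_1", "blk_2"]
# clamping length to the snapshot file size returns the single correct block:
assert located_blocks(blocks, 0, LONG_MAX, snapshot_size=BLOCK, clamp=True) == ["blk_1"]
```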
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660724#comment-13660724 ]

Hari Mankude commented on HDFS-4817:

Colin, can this feature be extended to determine where data should be stored on the DN? For example, a DN might have SSDs and SATA/SAS drives, and depending on hints provided by the user about the access patterns (random reads vs. long sequential reads), it might be useful to put the data on SSD vs. SATA. I understand that the NN has to be involved to keep this information persistent across block relocation. A nice goal would be to make the DN smarter (or able to learn, with minimal involvement from the NN) than it is right now, given that nodes can have storage devices with vastly different characteristics. Another option is to use access patterns to move data across the various storages in a DN (a sort of HSM). It looks like the current patch is mainly about managing the OS page cache.

make HDFS advisory caching configurable on a per-file basis
-----------------------------------------------------------

Key: HDFS-4817
URL: https://issues.apache.org/jira/browse/HDFS-4817
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
Attachments: HDFS-4817.001.patch

HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was drop-behind. Using this optimization, we could remove files from the Linux page cache after they were no longer needed. Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. The reason is that if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. This will allow more users to actually make use of them. It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect.
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660880#comment-13660880 ]

Hari Mankude commented on HDFS-4817:

I would look at the patch as an ability for the user to provide hints to the DN about access patterns (random reads, sequential reads, write-once-only, multiple access, etc.). It is incidental that these hints are currently used to manage the page cache. The same or similar hints could be used for moving blocks to different storage tiers on the DN. Another suggestion is to provide an fadvise()-like interface on the I/O stream that a user can use to send hints. I am aware of HDFS-4672; it is a complicated and correct way of managing storage pools.
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643734#comment-13643734 ]

Hari Mankude commented on HDFS-4750:

I would recommend thinking through NFS write operations. The client does caching, and the page cache can result in a lot of weirdness. For example, as long as the data is cached in the client's page cache, the client can do random writes and overwrites. When the page cache is flushed to the HDFS data store, some writes will fail (those that translate to overwrites in HDFS) while others might succeed (offsets that happen to be appends). An alternative to consider for supporting NFS writes is to require clients to do NFS mounts with directio enabled. Directio bypasses the client cache and might alleviate some of the funky behavior.

Support NFSv3 interface to HDFS
-------------------------------

Key: HDFS-4750
URL: https://issues.apache.org/jira/browse/HDFS-4750
Project: Hadoop HDFS
Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
Attachments: HADOOP-NFS-Proposal.pdf

Access to HDFS is usually done through the HDFS client or webHDFS. Lack of seamless integration with the client's file system makes it difficult for users, and impossible for some applications, to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track NFS protocol support for accessing HDFS. With the HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and able to support more applications and use cases. We will upload the design document and the initial implementation.
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642102#comment-13642102 ]

Hari Mankude commented on HDFS-4750:

Implementing writes might not be easy. The client implementations in various kernels do not guarantee that writes are issued in sequential order. Page-flushing algorithms try to find contiguous pages (offsets), but there are other factors in play, so writes from the client will not necessarily be sequential, as HDFS requires them to be. This is true whether the writes come in lazily from the client or due to a sync() before close(). A possible solution is for the NFS gateway on the DFS client to cache and reorder the writes to be sequential. But this might still result in holes, which HDFS cannot handle. Also, the cache requirements might not be trivial and might require a flush to local disk. NFS interfaces are very useful for reads.
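The reordering problem can be sketched as a tiny gateway buffer: out-of-order WRITEs are held back and flushed to the (append-only) store as soon as they form a contiguous prefix, while a missing byte range leaves a hole behind which nothing can be appended. This is an illustration of the failure mode, not the actual NFS gateway code.

```python
class AppendOnlyGateway:
    def __init__(self):
        self.flushed = b""   # what has reached the append-only store
        self.pending = {}    # offset -> data, buffered out-of-order writes

    def write(self, offset, data):
        self.pending[offset] = data
        # flush any buffered writes that now extend the sequential prefix
        while len(self.flushed) in self.pending:
            chunk = self.pending.pop(len(self.flushed))
            self.flushed += chunk  # only appends are possible in HDFS

gw = AppendOnlyGateway()
gw.write(4, b"ef")    # arrives out of order: buffered, cannot be appended yet
gw.write(0, b"abcd")  # fills the prefix; both chunks flush in order
assert gw.flushed == b"abcdef" and not gw.pending

gw2 = AppendOnlyGateway()
gw2.write(0, b"ab")
gw2.write(4, b"ef")   # client never sends bytes 2..3: a hole
assert gw2.flushed == b"ab" and 4 in gw2.pending  # stuck until the hole is filled
```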
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642270#comment-13642270 ]

Hari Mankude commented on HDFS-4758:

Actually, one use case of nested snapshots that I see is that a user might have different backup policies for /user (once every day) and /user/hive (every 8 hrs). When backing up /user, it is possible to set up an exclusion of the /user/hive directory so that two copies of /user/hive are not made. However, if snapshots cannot be taken of /user and /user/hive at the same time, that would be a disadvantage.

Disallow nested snapshottable directories
-----------------------------------------

Key: HDFS-4758
URL: https://issues.apache.org/jira/browse/HDFS-4758
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE

Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable them for now, until someone has a valid use case.
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642316#comment-13642316 ]

Hari Mankude commented on HDFS-4758:

The trade-off is between usability and complexity. In this case, it might result in situations where a user has taken a snapshot of /user/foo/dir1 and the admin finds that system-wide snapshots cannot be taken at, say, the /user dir level, since several users have their own snapshots at lower directories. This might limit the usability of the feature.
[jira] [Commented] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.
[ https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597548#comment-13597548 ]

Hari Mankude commented on HDFS-2576:

Is data skew going to be an issue, where some DNs are overloaded vs. other DNs? Would this be an issue when there is other data stored in HDFS along with hbase?

Namenode should have a favored nodes hint to enable clients to have control over block placement.
-------------------------------------------------------------------------------------------------

Key: HDFS-2576
URL: https://issues.apache.org/jira/browse/HDFS-2576
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Pritam Damania
Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch

Sometimes clients like HBase need to dynamically compute the datanodes on which they wish to place the blocks of a file, for a higher level of locality. For this purpose there needs to be a way to give the Namenode a hint, in the form of a favoredNodes parameter, about the locations where the client wants to put each block. The proposed solution is a favored-nodes parameter in the addBlock() method and in the create() file method to enable clients to give hints to the NameNode about the locations of each replica of the block. Note that this would be just a hint; the NameNode would still look at disk usage, datanode load, etc., and decide whether it can respect the hints or not.
[jira] [Commented] (HDFS-4087) Protocol changes for listSnapshots functionality
[ https://issues.apache.org/jira/browse/HDFS-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583360#comment-13583360 ]

Hari Mankude commented on HDFS-4087:

Has the listSnap CLI call been added?

Protocol changes for listSnapshots functionality
------------------------------------------------

Key: HDFS-4087
URL: https://issues.apache.org/jira/browse/HDFS-4087
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Affects Versions: Snapshot (HDFS-2802)
Reporter: Brandon Li
Assignee: Brandon Li
Labels: needs-test
Fix For: Snapshot (HDFS-2802)
Attachments: HDFS-4087.patch, HDFS-4087.patch, HDFS-4087.patch, HDFS-4087.patch

SnapInfo saves information about a snapshot. This jira also updates the java protocol classes and translation for the listSnapshot operation. Given a snapshot root, the snapshots created under it can be listed.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481669#comment-13481669 ]

Hari Mankude commented on HDFS-2802:

Nicholas is right in that we do start off with O(1) memory usage. But depending on writes and updates to the base filesystem, memory usage for a snapshot will increase. The worst case is when an application updates all the files in the snapshotted subtree. Even in this scenario, the snap inodes are minimized versions of the actual file inodes and retain only the information relevant to snapshots. Additionally (in the prototype), if multiple snapshots are taken of the same subtree, significant optimizations are done to reduce the memory footprint by representing more than one snapshot in a single snapINode.

Support for RW/RO snapshots in HDFS
-----------------------------------

Key: HDFS-2802
URL: https://issues.apache.org/jira/browse/HDFS-2802
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, name-node
Reporter: Hari Mankude
Assignee: Hari Mankude
Attachments: snap.patch, snapshot-one-pager.pdf, Snapshots20121018.pdf

Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.
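The memory argument above can be made concrete with a toy diff-list: each file keeps one inode plus a list of (snapshot, saved-state) entries that are created only when the file actually changes, so N snapshots of an unchanged file cost nothing beyond the snapshot records themselves, and one saved entry can serve several snapshots. Class and method names here are illustrative, not the prototype's actual classes.

```python
class FileInode:
    def __init__(self, size):
        self.size = size
        self.diffs = []  # (snapshot_id, size_at_snapshot), oldest first

    def snapshot_view(self, snap_id, snapshots_taken):
        # a snapshot sees the oldest saved state recorded at or after it,
        # or the current state if the file never changed since the snapshot
        for sid, size in self.diffs:
            if snapshots_taken.index(sid) >= snapshots_taken.index(snap_id):
                return size
        return self.size

    def modify(self, new_size, current_snap):
        # copy-on-write: save the old state at most once per snapshot, only on change
        if current_snap is not None and (not self.diffs or self.diffs[-1][0] != current_snap):
            self.diffs.append((current_snap, self.size))
        self.size = new_size

snaps = ["s1", "s2", "s3"]  # snapshot order
f = FileInode(size=100)
# s1 and s2 taken with no modification in between: zero diffs stored
assert f.diffs == []
f.modify(200, current_snap="s2")  # first change after s2
assert len(f.diffs) == 1          # one saved entry serves both s1 and s2
assert f.snapshot_view("s1", snaps) == 100
assert f.snapshot_view("s2", snaps) == 100
f.modify(300, current_snap="s3")
assert f.snapshot_view("s3", snaps) == 200
assert f.size == 300
```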
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480745#comment-13480745 ]

Hari Mankude commented on HDFS-2802:

Todd, another option is to look at the inodesUnderConstruction in the NN and query the DNs for the exact file size at the time of taking the snapshot. Even with this, the file size obtained will be as of that instant. Applications like hbase will have to deal with hlogs that could have incomplete log entries when an uncoordinated snapshot is taken at the HDFS level. A better approach is to have the application reach a quiesce point and then take a snap. This is normally done for Oracle (hot backup mode) and SQL Server so that an application-consistent snapshot can be taken. Also, createSnap()/removeSnap() hold the writeLock() on the FSNamesystem, which ensures there are no other metadata updates while the snap is being taken.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480784#comment-13480784 ]

Hari Mankude commented on HDFS-2802:

Todd, I do not agree that your solution would be any more beneficial to hbase than what is being proposed. Any txid information in the DNs will be as of the beginning of the transaction. If the client is writing in the middle of a block, there is no way to know the exact size when the snap was taken. Querying inodesUnderConstruction gives the block length at the time of the query. It is not possible to take an application-consistent snapshot (one which does not require recovery) without coordination with the application. In fact, communicating with the DNs while snapshots are being taken will make taking snapshots very slow while giving very little additional benefit.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480785#comment-13480785 ]

Hari Mankude commented on HDFS-2802:

Sorry, hit the comment button early. Additionally, including the sizes of non-finalized blocks in snapshots has the implication that if the client dies and the non-finalized section is discarded, the snapshot might have pointers to non-existent blocks.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412969#comment-13412969 ]

Hari Mankude commented on HDFS-2802:

A quick user's guide:

hadoop dfsadmin -createsnap snapname <path where snap is to be taken> ro/rw
    creates a snap with snapname at the location mentioned

hadoop dfsadmin -removesnap snapname
    removes the snapshot

hadoop dfsadmin -listsnap /
    lists all snaps that have been taken under /
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412508#comment-13412508 ]

Hari Mankude commented on HDFS-2802:

I am attaching an early version of the patch, based on trunk. It took longer than I expected to rebase the patch. The code needs cleanup, further optimization of memory usage in the NN, fixes to the checkpointing code to handle some border conditions, and more tests (some of which we discussed during the HDFS meetup).

Next steps: working on splitting the patch into smaller, easier-to-review pieces. Branch HDFS-2802 has been created for this work. The next version of the design document will be posted soon.
[jira] [Updated] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Mankude updated HDFS-2802:
-------------------------------

Attachment: snap.patch
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274035#comment-13274035 ]

Hari Mankude commented on HDFS-3370:

Can the hard-linked files be reopened for append?

HDFS hardlink
-------------

Key: HDFS-3370
URL: https://issues.apache.org/jira/browse/HDFS-3370
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Hairong Kuang
Assignee: Liyin Tang
Attachments: HDFS-HardLink.pdf

We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well. Among the many potential use cases of the feature, the following two are primarily used in facebook: 1. This provides a lightweight way for applications like hbase to create a snapshot; 2. This also allows an application like Hive to move a table to a different directory without breaking currently running hive queries.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271616#comment-13271616 ] Hari Mankude commented on HDFS-2802: @Eli, Regarding scenario #3, consider an HBase setup with a huge dataset in production. A new app has been developed which needs to be validated against the production dataset. It is not feasible to copy the entire dataset to a test setup. At the same time, the app is not ready for production, and it is not safe to have it modify the data in the production database. One solution for these types of problems is to take a RW snapshot of the production dataset and then have the development app run against the RW snapshot. After the app testing is done, the RW snap is deleted. This assumes that the cluster has sufficient compute capacity and incremental storage capacity to support RW snaps. Regarding appends, the current prototype of snapshot relies on the file size that is available at the namenode. So, if a file is appended after a snap is taken, it is a no-op from a snap perspective. If a snap is taken of a file which has an append pipeline set up, the inode is of type under-construction in the NN. The prototype relies on the file size that is available on the NN for snaps. This might not be perfect, and I have some ideas on acquiring a more up-to-date file size. I thought that truncate is not currently supported in trunk. If you are referring to deletes, the prototype handles deletes correctly without issues. I will post a more detailed doc after I am done with HA-related work. Support for RW/RO snapshots in HDFS --- Key: HDFS-2802 URL: https://issues.apache.org/jira/browse/HDFS-2802 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.24.0 Reporter: Hari Mankude Assignee: Hari Mankude Attachments: snapshot-one-pager.pdf Snapshots are point-in-time images of parts of the filesystem or the entire filesystem.
Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.
[jira] [Updated] (HDFS-3293) Implement equals for storageinfo and journainfo class.
[ https://issues.apache.org/jira/browse/HDFS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Mankude updated HDFS-3293: --- Attachment: hdfs-3293-2.patch Implement equals for storageinfo and journainfo class. --- Key: HDFS-3293 URL: https://issues.apache.org/jira/browse/HDFS-3293 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hdfs-3293-1.patch, hdfs-3293-2.patch, hdfs-3293.patch Implement equals for storageinfo and journalinfo class. Also journalinfo class needs a toString() method.
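The requested equals()/toString() pair can be illustrated with a minimal, self-contained sketch. This is a hypothetical example: the class shape and field names (layoutVersion, namespaceID, cTime) are assumptions for illustration, not the actual Hadoop StorageInfo/JournalInfo source. The key points are that equals() compares every identifying field and that hashCode() is overridden consistently with equals().

```java
// Hypothetical sketch of an equals()/hashCode()/toString() implementation
// for a StorageInfo-like class. Field names are illustrative assumptions.
public class StorageInfoSketch {
    static class StorageInfo {
        final int layoutVersion;
        final int namespaceID;
        final long cTime;

        StorageInfo(int layoutVersion, int namespaceID, long cTime) {
            this.layoutVersion = layoutVersion;
            this.namespaceID = namespaceID;
            this.cTime = cTime;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof StorageInfo)) return false;
            StorageInfo that = (StorageInfo) o;
            // Two infos are equal iff every identifying field matches.
            return layoutVersion == that.layoutVersion
                && namespaceID == that.namespaceID
                && cTime == that.cTime;
        }

        @Override
        public int hashCode() {
            // Must stay consistent with equals(): same fields, same order.
            return 31 * (31 * layoutVersion + namespaceID) + Long.hashCode(cTime);
        }

        @Override
        public String toString() {
            return "lv=" + layoutVersion + ";nsid=" + namespaceID + ";c=" + cTime;
        }
    }

    public static void main(String[] args) {
        StorageInfo a = new StorageInfo(-40, 123, 0L);
        StorageInfo b = new StorageInfo(-40, 123, 0L);
        System.out.println(a.equals(b));  // true: all fields match
        System.out.println(a);            // lv=-40;nsid=123;c=0
    }
}
```

A JournalInfo-like class would follow the same pattern over its own fields; overriding hashCode() alongside equals() keeps the objects safe to use in hash-based collections.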
[jira] [Assigned] (HDFS-3325) When configuring dfs.namenode.safemode.threshold-pct to a value greater or equal to 1 there is mismatch in the UI report
[ https://issues.apache.org/jira/browse/HDFS-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Mankude reassigned HDFS-3325: -- Assignee: Hari Mankude When configuring dfs.namenode.safemode.threshold-pct to a value greater or equal to 1 there is mismatch in the UI report -- Key: HDFS-3325 URL: https://issues.apache.org/jira/browse/HDFS-3325 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: J.Andreina Assignee: Hari Mankude Priority: Minor Fix For: 2.0.0, 3.0.0 When dfs.namenode.safemode.threshold-pct is configured to n Namenode will be in safemode until n percentage of blocks that should satisfy the minimal replication requirement defined by dfs.namenode.replication.min is reported to namenode But in UI it displays that n percentage of total blocks + 1 blocks are additionally needed to come out of the safemode Scenario 1: Configurations: dfs.namenode.safemode.threshold-pct = 2 dfs.replication = 2 dfs.namenode.replication.min =2 Step 1: Start NN,DN1,DN2 Step 2: Write a file a.txt which has got 167 blocks step 3: Stop NN,DN1,DN2 Step 4: start NN In UI report the Number of blocks needed to come out of safemode and number of blocks actually present is different. {noformat} Cluster Summary Security is OFF Safe mode is ON. The reported blocks 0 needs additional 335 blocks to reach the threshold 2. of total blocks 167. Safe mode will be turned off automatically. 2 files and directories, 167 blocks = 169 total. Heap Memory used 57.05 MB is 2% of Commited Heap Memory 2 GB. Max Heap Memory is 2 GB. Non Heap Memory used 23.37 MB is 17% of Commited Non Heap Memory 130.44 MB. 
Max Non Heap Memory is 176 MB.{noformat} Scenario 2: === Configurations: dfs.namenode.safemode.threshold-pct = 1 dfs.replication = 2 dfs.namenode.replication.min =2 Step 1: Start NN,DN1,DN2 Step 2: Write a file a.txt which has got 167 blocks step 3: Stop NN,DN1,DN2 Step 4: start NN In UI report the Number of blocks needed to come out of safemode and number of blocks actually present is different {noformat} Cluster Summary Security is OFF Safe mode is ON. The reported blocks 0 needs additional 168 blocks to reach the threshold 1. of total blocks 167. Safe mode will be turned off automatically. 2 files and directories, 167 blocks = 169 total. Heap Memory used 56.2 MB is 2% of Commited Heap Memory 2 GB. Max Heap Memory is 2 GB. Non Heap Memory used 23.37 MB is 17% of Commited Non Heap Memory 130.44 MB. Max Non Heap Memory is 176 MB.{noformat}
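The numbers in both scenarios are consistent with the UI computing the block target as threshold * total + 1. A small sketch reproduces the reported arithmetic; the method name and formula here are assumptions inferred from the two scenarios above, not the actual NameNode code.

```java
// Hypothetical sketch reproducing the safemode arithmetic implied by the
// report: with target = threshold * total + 1, any threshold-pct >= 1
// yields a target larger than the number of blocks that actually exist,
// so the cluster can never leave safemode on its own.
public class SafeModeMathSketch {
    static long blocksNeeded(double thresholdPct, long totalBlocks, long reportedBlocks) {
        // Assumed formula: truncate threshold * total, then add one.
        long target = (long) (thresholdPct * totalBlocks) + 1;
        return Math.max(target - reportedBlocks, 0);
    }

    public static void main(String[] args) {
        // Scenario 1: threshold-pct = 2, 167 blocks, 0 reported -> 335
        System.out.println(blocksNeeded(2.0, 167, 0));
        // Scenario 2: threshold-pct = 1, 167 blocks, 0 reported -> 168,
        // one more block than exists in the cluster.
        System.out.println(blocksNeeded(1.0, 167, 0));
        // For threshold-pct < 1 the target stays reachable: -> 167
        System.out.println(blocksNeeded(0.999, 167, 0));
    }
}
```

This matches the "needs additional 335 blocks" and "needs additional 168 blocks" messages quoted above, which is why the mismatch only appears once the threshold reaches 1.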
[jira] [Updated] (HDFS-3293) Implement equals for storageinfo and journainfo class.
[ https://issues.apache.org/jira/browse/HDFS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Mankude updated HDFS-3293: --- Target Version/s: 0.24.0 Status: Patch Available (was: Open) Implement equals for storageinfo and journainfo class. --- Key: HDFS-3293 URL: https://issues.apache.org/jira/browse/HDFS-3293 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hdfs-3293.patch Implement equals for storageinfo and journalinfo class. Also journalinfo class needs a toString() method.
[jira] [Updated] (HDFS-3293) Implement equals for storageinfo and journainfo class.
[ https://issues.apache.org/jira/browse/HDFS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Mankude updated HDFS-3293: --- Attachment: hdfs-3293.patch Implement equals for storageinfo and journainfo class. --- Key: HDFS-3293 URL: https://issues.apache.org/jira/browse/HDFS-3293 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hdfs-3293.patch Implement equals for storageinfo and journalinfo class. Also journalinfo class needs a toString() method.
[jira] [Commented] (HDFS-3293) Implement equals for storageinfo and journainfo class.
[ https://issues.apache.org/jira/browse/HDFS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263987#comment-13263987 ] Hari Mankude commented on HDFS-3293: The changes are trivial, so a test is not included. Implement equals for storageinfo and journainfo class. --- Key: HDFS-3293 URL: https://issues.apache.org/jira/browse/HDFS-3293 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hdfs-3293.patch Implement equals for storageinfo and journalinfo class. Also journalinfo class needs a toString() method.
[jira] [Resolved] (HDFS-3205) testHANameNodesWithFederation is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Mankude resolved HDFS-3205. Resolution: Duplicate This is a duplicate of HDFS-2960. testHANameNodesWithFederation is failing in trunk - Key: HDFS-3205 URL: https://issues.apache.org/jira/browse/HDFS-3205 Project: Hadoop HDFS Issue Type: Bug Components: ha, name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor The test is failing with the error org.junit.ComparisonFailure: expected:ns1-nn1.example.com[]:8020 but was:ns1-nn1.example.com[/50.28.50.93]:8020
[jira] [Updated] (HDFS-3293) Implement equals for storageinfo and journainfo class.
[ https://issues.apache.org/jira/browse/HDFS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Mankude updated HDFS-3293: --- Attachment: hdfs-3293-1.patch Implement equals for storageinfo and journainfo class. --- Key: HDFS-3293 URL: https://issues.apache.org/jira/browse/HDFS-3293 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hdfs-3293-1.patch, hdfs-3293.patch Implement equals for storageinfo and journalinfo class. Also journalinfo class needs a toString() method.
[jira] [Commented] (HDFS-3293) Implement equals for storageinfo and journainfo class.
[ https://issues.apache.org/jira/browse/HDFS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264022#comment-13264022 ] Hari Mankude commented on HDFS-3293: Fixed all the issues mentioned by Nicholas. Implement equals for storageinfo and journainfo class. --- Key: HDFS-3293 URL: https://issues.apache.org/jira/browse/HDFS-3293 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hdfs-3293-1.patch, hdfs-3293.patch Implement equals for storageinfo and journalinfo class. Also journalinfo class needs a toString() method.