Hi, thanks for your reply Dhruba,
One of my co-workers is writing a BigTable-like application that could be
used for online, near-real-time services. Since the application could be
hooked into online services, there would be times when a large number of
users (e.g. 1000 users) request access to a few files in a very short time.
Of course, this is rare in a batch-processing job, but for online
services it's quite common.
I think HBase developers would have run into similar issues as well.
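To make the idea concrete, here is a minimal sketch of the kind of cache I
have in mind. It is not existing Hadoop code: FileInfoCache, Loader, the TTL
and invalidate() are names made up for illustration, and the loader only
stands in for the namenode call (callGetBlockLocations in the DFSClient code
quoted below).

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical client-side cache of per-file info (e.g. block locations),
 * keyed by path. Illustrative only; not part of Hadoop.
 */
class FileInfoCache<V> {

  /** Stand-in for the namenode RPC (assumption: one call per path). */
  interface Loader<T> {
    T load(String src) throws IOException;
  }

  /** A cached value plus the time it was fetched. */
  private static final class Entry<T> {
    final T value;
    final long loadedAtMillis;

    Entry(T value, long loadedAtMillis) {
      this.value = value;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final ConcurrentMap<String, Entry<V>> entries =
      new ConcurrentHashMap<>();
  private final Loader<V> loader;
  private final long ttlMillis;

  FileInfoCache(Loader<V> loader, long ttlMillis) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
  }

  /**
   * Returns cached info for src, calling the loader only on a miss or
   * once the entry is older than the TTL. Two threads that miss at the
   * same time may both call the loader; the later put simply wins.
   */
  V get(String src) throws IOException {
    long now = System.currentTimeMillis();
    Entry<V> cached = entries.get(src);
    if (cached != null && now - cached.loadedAtMillis < ttlMillis) {
      return cached.value;
    }
    V fresh = loader.load(src);
    if (fresh == null) {
      throw new IOException("Cannot open filename " + src);
    }
    entries.put(src, new Entry<>(fresh, now));
    return fresh;
  }

  /** Drops an entry, e.g. after a read fails on stale block locations. */
  void invalidate(String src) {
    entries.remove(src);
  }
}
```

With something like this, 1000 opens of the same file within the TTL would
cost roughly one namenode call instead of 1000. The TTL and invalidate() are
there because block locations can change while an entry is cached; how stale
an entry may safely get is something I have not worked out.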
Is this enough explanation?
Thanks in advance,
Taeho
On Tue, Nov 4, 2008 at 3:12 AM, Dhruba Borthakur <[EMAIL PROTECTED]> wrote:
> In the current code, details about block locations of a file are
> cached on the client when the file is opened. This cache remains with
> the client until the file is closed. If the same file is re-opened by
> the same DFSClient, it re-contacts the namenode and refetches the
> block locations. This works ok for most map-reduce apps because it is
> rare that the same DFSClient re-opens the same file.
>
> Can you please explain your use case?
>
> thanks,
> dhruba
>
>
> On Sun, Nov 2, 2008 at 10:57 PM, Taeho Kang <[EMAIL PROTECTED]> wrote:
> > Dear Hadoop Users and Developers,
> >
> > I was wondering if there's a plan to add a "file info cache" to DFSClient?
> >
> > It could eliminate the network round-trip cost of contacting the
> > Namenode, and I think it would greatly improve DFSClient's performance.
> > The code I was looking at was this:
> >
> > -----------------------
> > DFSClient.java
> >
> > /**
> >  * Grab the open-file info from namenode
> >  */
> > synchronized void openInfo() throws IOException {
> >   /* Maybe, we could add a file info cache here! */
> >   LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
> >   if (newInfo == null) {
> >     throw new IOException("Cannot open filename " + src);
> >   }
> >   if (locatedBlocks != null) {
> >     Iterator<LocatedBlock> oldIter =
> >         locatedBlocks.getLocatedBlocks().iterator();
> >     Iterator<LocatedBlock> newIter =
> >         newInfo.getLocatedBlocks().iterator();
> >     while (oldIter.hasNext() && newIter.hasNext()) {
> >       if (!oldIter.next().getBlock().equals(newIter.next().getBlock())) {
> >         throw new IOException("Blocklist for " + src + " has changed!");
> >       }
> >     }
> >   }
> >   this.locatedBlocks = newInfo;
> >   this.currentNode = null;
> > }
> > -----------------------
> >
> > Does anybody have an opinion on this matter?
> >
> > Thank you in advance,
> >
> > Taeho
> >
>