GitHub user RongGu commented on the pull request:

    https://github.com/apache/spark/pull/158#issuecomment-38953232
  
    I'm playing around with testing this patch locally today... will let you
    know how it goes :)
    
    
    On Fri, Mar 28, 2014 at 11:32 AM, Patrick Wendell <[email protected]> wrote:
    
    > Hey Rong,
    >
    > I just didn't know this was necessary in Tachyon. But if we keep it, yes
    > let's just keep the number of directories at 64. The main issue was just
    > the code complexity. It seemed a little ugly to have all this duplicated
    > code from the DiskStore - I feel it might be possible to consolidate it
    > more.
    >
    > But at this point I think it's okay to just put that as a TODO.
    >
    > - Patrick
    >
    >
    > On Fri, Mar 28, 2014 at 11:29 AM, 顾荣 <[email protected]> wrote:
    >
    >> Hi Patrick.
    >>
    >> Thank you for the comments! The GitHub website is currently not
    >> accessible from my location, so I have to send this email to discuss my
    >> latest update with you.
    >>
    >> In fact, in my old version, the TachyonFilePathResolver interface along
    >> with getBlockLocation() was used by TachyonStore.getSize(blockId:
    >> BlockId) to get the size of a block. That information is then used for
    >> the storage usage metrics in the UI, among other things. I added it by
    >> analogy with the DiskStore's PathResolver interface. However, as you
    >> suggested, to make the code more concise, I now get the size directly
    >> from the tachyonFile. This way, we have removed a lot of unnecessary
    >> code here.
    >>
    >> As for the subdirectories issue, I suggest keeping them, because for some
    >> large datasets the number of blocks on one executor can easily reach
    >> thousands or even millions. I am afraid that at that point we would have
    >> to add this piece of code again. Also, I agree with haoyuan that we
    >> should set the number small for now. Thanks.
    >>
    >> Regards.
    >> Rong Gu
    >>
    >>
    >>
    >> 2014-03-29 2:05 GMT+08:00 Patrick Wendell <[email protected]>:
    >>
    >>> @haoyuan <https://github.com/haoyuan> hey HY - the issue is mostly
    >>> around keeping the complexity of the code minimal and avoiding a bunch
    >>> of code duplication. The code around deleting these subdirectories is
    >>> not trivial and right now the functions are just copy/pasted for
    >>> Tachyon. I'll look at it a bit more...
    >>>
    >>>
    >>
    >>
    >>
    >> --
    >> ------------------
    >> Rong Gu
    >> Department of Computer Science and Technology
    >> State Key Laboratory for Novel Software Technology
    >> Nanjing University
    >> Phone: +86 15850682791
    >> Email: [email protected]
    >>
    >
    >
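
For readers following the thread, here is a minimal sketch of the kind of subdirectory hashing being discussed: a block file name is hashed to pick a root directory and then one of a fixed number of subdirectories, so that no single directory ends up holding thousands or millions of files. This is an illustration under stated assumptions, not the code from the patch. The names `SubDirSketch`, `nonNegativeHash`, `locate`, and `relativePath` are hypothetical, the path layout is made up for the example, and the default of 64 is simply the figure mentioned in the thread.

```scala
object SubDirSketch {
  // Hypothetical helper: turn a String hash code into a non-negative Int.
  // Int.MinValue has no positive counterpart, so it is mapped to 0.
  def nonNegativeHash(s: String): Int = {
    val h = s.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  // Pick (root directory index, subdirectory index) for a block file name.
  // The low part of the hash selects the root directory and the remaining
  // part selects the subdirectory, so the two choices are spread separately.
  def locate(fileName: String, numRootDirs: Int, subDirsPerRoot: Int = 64): (Int, Int) = {
    require(numRootDirs > 0 && subDirsPerRoot > 0, "directory counts must be positive")
    val hash = nonNegativeHash(fileName)
    val dirId = hash % numRootDirs
    val subDirId = (hash / numRootDirs) % subDirsPerRoot
    (dirId, subDirId)
  }

  // Build an illustrative relative path: <dirId>/<subDirId as two hex digits>/<fileName>.
  def relativePath(fileName: String, numRootDirs: Int, subDirsPerRoot: Int = 64): String = {
    val (dirId, subDirId) = locate(fileName, numRootDirs, subDirsPerRoot)
    f"$dirId/$subDirId%02x/$fileName"
  }
}
```

Because the mapping depends only on the file name and the two directory counts, the same block always resolves to the same location, which is what lets a store find or delete a block file without keeping a separate index.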


