Github user RongGu commented on the pull request:
https://github.com/apache/spark/pull/158#issuecomment-38952734
Hi Patrick,
Thank you for your comments! GitHub is currently not accessible from my
location, so I am sending this email to discuss my latest update with
you.
In my old version, the TachyonFilePathResolver interface and its
getBlockLocation() method were used by TachyonStore.getSize(blockId:
BlockId) to get the size of a block. That size is then used for storage
usage metrics, for example in the UI. I modeled this on DiskStore's
PathResolver interface. However, as you suggested, to make the code more
concise, I now get the size directly from the tachyonFile. This removes a
lot of unnecessary code.
As for the subdirectories issue, I suggest keeping them. For some large
datasets, the number of blocks on one executor can easily reach thousands
or even millions. I am afraid that at that point we would have to add this
piece of code back. Also, I agree with haoyuan that we should set the
number of subdirectories to a small value for now. Thanks.
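To illustrate why subdirectories help when the block count grows, here is
a minimal sketch in Python. It is not the actual Spark or Tachyon code:
the function names, the "rdd_0_<i>" file names, and the use of CRC32 as
the hash are all assumptions made for the example. It only shows the
general idea of hashing a block file name into a (root directory,
subdirectory) pair so that no single directory holds every block file.

```python
import zlib
from collections import Counter


def sub_dir_for(block_file_name, num_root_dirs, sub_dirs_per_root):
    """Map a block file name to a (root dir index, subdirectory index) pair.

    Hypothetical helper: CRC32 is used only because it is a stable,
    non-negative hash; the real code may hash differently.
    """
    h = zlib.crc32(block_file_name.encode("utf-8"))
    root_id = h % num_root_dirs
    sub_id = (h // num_root_dirs) % sub_dirs_per_root
    return root_id, sub_id


def max_files_per_dir(num_blocks, num_root_dirs, sub_dirs_per_root):
    """Count how many block files land in the fullest directory."""
    counts = Counter(
        sub_dir_for("rdd_0_%d" % i, num_root_dirs, sub_dirs_per_root)
        for i in range(num_blocks)
    )
    return max(counts.values())


if __name__ == "__main__":
    # Compare one flat directory against a small number of subdirectories.
    for sub_dirs in (1, 8, 64):
        print(sub_dirs, max_files_per_dir(100000, 1, sub_dirs))
```

With a single subdirectory every block file ends up in the same
directory; even a small subdirectory count spreads them out, which is why
a small default seems reasonable for now.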
Regards.
Rong Gu
2014-03-29 2:05 GMT+08:00 Patrick Wendell <[email protected]>:
> @haoyuan <https://github.com/haoyuan> hey HY - the issue is mostly around
> keeping the complexity of the code minimal and avoiding a bunch of code
> duplication. The code around deleting these subdirectories is not trivial
> and right now the functions are just copy/pasted for Tachyon. I'll look at
> it a bit more...
--
------------------
Rong Gu
Department of Computer Science and Technology
State Key Laboratory for Novel Software Technology
Nanjing University
Phone: +86 15850682791
Email: [email protected]