[ 
https://issues.apache.org/jira/browse/FLUME-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15548622#comment-15548622
 ] 

Jason Kushmaul commented on FLUME-2994:
---------------------------------------

Uniqueness of FileKey.hashCode:
Can you be more specific about why it might not be unique enough?  My hope was 
that this hashCode override would be good enough: 
http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/windows/classes/sun/nio/ch/FileKey.java
{noformat}
    public int hashCode() {
        return (int)(dwVolumeSerialNumber ^ (dwVolumeSerialNumber >>> 32)) +
               (int)(nFileIndexHigh ^ (nFileIndexHigh >>> 32)) +
               (int)(nFileIndexLow ^ (nFileIndexHigh >>> 32));
    }
{noformat}
I'm not defending it; I simply can't tell you how unique that will be, so I 
was hoping you could do the opposite and tell me how unique it will not be.  
What I can tell you is that the same file produced the same value from run to 
run, and the values were different across the very small number of files I 
tested.
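
One way to see where collisions could come from is to copy the quoted formula 
into a standalone method and feed it made-up values.  The sketch below is an 
illustration only: the class name and the inputs are invented, and it assumes 
each field holds a value that fits in 32 bits.  Under that assumption the 
">>> 32" terms are zero and the hash is just the sum of the three fields, so 
swapping the high and low index values gives the same result.

```java
// Standalone copy of the hashCode formula quoted above, so it can be exercised
// without touching the internal sun.nio.ch.FileKey class.
class FileKeyHashSketch {

    // Same arithmetic as the quoted method; the three fields are passed in as longs.
    static int hash(long dwVolumeSerialNumber, long nFileIndexHigh, long nFileIndexLow) {
        return (int) (dwVolumeSerialNumber ^ (dwVolumeSerialNumber >>> 32)) +
               (int) (nFileIndexHigh ^ (nFileIndexHigh >>> 32)) +
               (int) (nFileIndexLow ^ (nFileIndexHigh >>> 32));
    }

    public static void main(String[] args) {
        long volume = 0x1234ABCDL;       // made-up volume serial number
        int a = hash(volume, 1L, 2L);    // made-up file index: high=1, low=2
        int b = hash(volume, 2L, 1L);    // a different file index: high=2, low=1
        System.out.println(a == b);      // prints "true": two identities, one hash
    }
}
```

This only shows that collisions are possible in principle; it says nothing 
about how often real file index values on one volume would actually collide.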

I think a closer look is warranted now, so I will provide some data on how 
unique it is.  If you have any suggestions for that, please let me know and 
I'll be sure to include them.  Otherwise I'll get started with what I have in 
mind right now, which is to generate a configurable number of files and then 
check fileKey.hashCode on them for uniqueness.  Crude, but I think it will 
prove it worthy (or not).
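
A rough sketch of that kind of check is below.  Everything in it is 
hypothetical (class name, method names, file naming), and the hash source is 
pluggable on purpose: the default shown here uses 
BasicFileAttributes.fileKey(), which is documented to return null when no key 
is available, so on Windows whatever call actually produces the FileKey would 
have to be swapped in for fileKeyHash.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToIntFunction;

class FileKeyUniquenessCheck {

    // Creates fileCount empty files in a fresh temp directory, hashes each one with
    // the supplied function, and returns how many distinct hash values were seen.
    // Files are only deleted at the end, so all of them exist at the same time.
    static int distinctHashes(int fileCount, ToIntFunction<Path> hasher) {
        List<Path> created = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        try {
            Path dir = Files.createTempDirectory("filekey-check");
            try {
                for (int i = 0; i < fileCount; i++) {
                    Path file = Files.createFile(dir.resolve("f" + i + ".log"));
                    created.add(file);
                    seen.add(hasher.applyAsInt(file));
                }
            } finally {
                for (Path file : created) {
                    Files.delete(file);
                }
                Files.delete(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen.size();
    }

    // Hash of whatever file key the default provider exposes; a null key maps to 0,
    // which would show up in the result as every file sharing one hash.
    static int fileKeyHash(Path file) {
        try {
            Object key = Files.readAttributes(file, BasicFileAttributes.class).fileKey();
            return key == null ? 0 : key.hashCode();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int fileCount = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        int distinct = distinctHashes(fileCount, FileKeyUniquenessCheck::fileKeyHash);
        System.out.println(distinct + " distinct hashes for " + fileCount + " files");
    }
}
```

If the distinct count comes out lower than the file count, the difference is 
the number of files that collided with an earlier one.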

tailFiles Map:
The only place FileKey is used is to get an "inode"-like value on Windows, so 
I don't think we should use it in the tailFiles map: that would spread a 
Windows workaround object through the rest of the code rather than keeping it 
contained in that single function.  (Did I misread what you were asking?)
I would continue to use Long in the tailFiles map, because on Unix the inode 
is the primary way to identify a file other than its path (and the path can 
change if the file is "mv"d). 
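
To illustrate the containment argument, here is a minimal sketch rather than 
the code from the patch.  The class, the method names and the windowsLookup 
hook are all hypothetical, and the map value type is a stand-in.  The only 
call taken from the issue is Files.getAttribute(path, "unix:ino").  The point 
is that the OS switch lives in one method and everything else keeps working 
with a Long key.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.ToLongFunction;

class InodeLookupSketch {

    // Hypothetical hook for the Windows-only lookup (for example a value derived
    // from FileKey). Only getInode() ever calls it.
    private final ToLongFunction<Path> windowsLookup;

    InodeLookupSketch(ToLongFunction<Path> windowsLookup) {
        this.windowsLookup = windowsLookup;
    }

    // The single place that knows about the OS difference; callers only see a long.
    long getInode(Path file) {
        String os = System.getProperty("os.name").toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return windowsLookup.applyAsLong(file);
        }
        try {
            return (Long) Files.getAttribute(file, "unix:ino");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo: the map stays keyed by Long, and the key is looked up again after a rename.
    static boolean keySurvivesRename(InodeLookupSketch lookup) {
        try {
            Path dir = Files.createTempDirectory("taildir-sketch");
            Path before = Files.createFile(dir.resolve("app.log"));
            Map<Long, Path> tailFiles = new HashMap<>();   // stand-in for the real map
            tailFiles.put(lookup.getInode(before), before);

            Path after = Files.move(before, dir.resolve("app.log.1"));
            boolean found = tailFiles.containsKey(lookup.getInode(after));

            Files.delete(after);
            Files.delete(dir);
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The Windows hook is deliberately left unimplemented in this sketch.
        InodeLookupSketch lookup = new InodeLookupSketch(p -> {
            throw new UnsupportedOperationException("Windows lookup not sketched");
        });
        System.out.println(keySurvivesRename(lookup));
    }
}
```

With this shape, changing how the Windows value is derived would only touch 
the hook passed to the constructor, not the map or its callers.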
  

> flume-taildir-source: support for windows
> -----------------------------------------
>
>                 Key: FLUME-2994
>                 URL: https://issues.apache.org/jira/browse/FLUME-2994
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources, Windows
>    Affects Versions: v1.7.0
>            Reporter: Jason Kushmaul
>            Assignee: Jason Kushmaul
>            Priority: Trivial
>             Fix For: v1.7.0
>
>         Attachments: FLUME-2994-2.patch, taildir-mac.conf, taildir-win8.1.conf
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The current implementation of flume-taildir-source does not support windows.
> The only reason for this from what I can see is a simple call to 
> Files.getAttribute(file.toPath(), "unix:ino");
> I've tested an equivalent for windows (which of course does not work on 
> non-windows).  With an OS switch we should be able to identify a file 
> independent of file name on either system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
