>From Michael Blow <[email protected]>:

Michael Blow has posted comments on this change. ( 
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064 )

Change subject: [NO ISSUE][HYR][STO] Transform long filenames / paths
......................................................................


Patch Set 6: Code-Review+1

(10 comments)

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java
File 
hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@59
PS4, Line 59:            try {
            :                 registry.load(new InputStreamReader(new 
FileInputStream(registryFile), StandardCharsets.UTF_8));
> try with resources for the streams?
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@84
PS4, Line 84: if (!candidate.equals(finalPath)) {
> why candidate and not originalpath? we only store in the registry if length 
> needs to be translated?
i went back & forth on this, but decided to keep the registry minimally-sized- 
it wouldn't take much convincing to flop back- the trade-off is registry size 
-vs- the one-time re-evaluation of a "safe" path each startup of the NC- it 
didn't seem too bad to do the duplicate work, since it's only once per boot...


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@118
PS4, Line 118:  replacement += nextFixupToken();
> javac translates this sorta stuff into stringbuilder now right?
yeah


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@131
PS4, Line 131:               try {
             :                     registry.store(new 
FileWriterWithEncoding(registryFile, StandardCharsets.UTF_8),
             :                             "Generated by " + 
getClass().getName());
> try with resources? also i think this does flush but is that sufficient to 
> ensure durability?
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@180
PS4, Line 180: "~"
> lol MICROS~1
Ack


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@184
PS4, Line 184:  return utf8Length(file.getAbsolutePath());
> what if the path is not normalized?
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@187
PS4, Line 187:     protected static int utf8PathLength(Path pathSegment) {
> unused?
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/SafePathTransformer.java@198
PS4, Line 198:      return element.substring(0, pathSegmentBytesLimit - 10 - 
MAX_FIXUP_LENGTH) + "..."
             :                     + element.substring(element.length() - 10);
> doesn't there need to be some iteration here to figure out how many chars to 
> remove to scrape under  […]
no, we only patching one segment here- so we know that we will now be under the 
limit with the substrings + the fixup we add after we return


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/UnixPathTransformerFactory.java
File 
hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/UnixPathTransformerFactory.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/UnixPathTransformerFactory.java@34
PS4, Line 34:  private static final int PATH_OVERALL_BYTES_LIMIT = Integer
            :                 
.getInteger(UnixPathTransformerFactory.class.getName() + 
".pathOverallBytesLimit", Integer.MAX_VALUE);
> from poking around in google it seems like this is 4096 in linux (at least as 
> a lower bound)?
but yet it's not, in actuality- linux has this compiled into the headers but 
most (all?) filesystems don't have any such limit -- not really sure what to do 
here, but i guess conservative is safest


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064/4/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/UnixPathTransformerFactory.java@43
PS4, Line 43:       
> isn't NUL (\0) usually illegal?
do we allow NUL \0 in our parser?  i guess i didn't try- i updated this to 
replace



--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/10064
To unsubscribe, or for help writing mail filters, visit 
https://asterix-gerrit.ics.uci.edu/settings

Gerrit-Project: asterixdb
Gerrit-Branch: cheshire-cat
Gerrit-Change-Id: I26d7223611cc68e0f99c9c9eead9c0a344a053e7
Gerrit-Change-Number: 10064
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Blow <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Michael Blow <[email protected]>
Gerrit-Reviewer: Murtadha Hubail <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-CC: Ian Maxon <[email protected]>
Gerrit-Comment-Date: Fri, 26 Feb 2021 20:50:02 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Comment-In-Reply-To: Ian Maxon <[email protected]>
Gerrit-MessageType: comment

Reply via email to