Gary Gregory created VFS-483:
--------------------------------

             Summary: [RAM] Many suggestions to improve the RAM file provider.
                 Key: VFS-483
                 URL: https://issues.apache.org/jira/browse/VFS-483
             Project: Commons VFS
          Issue Type: Improvement
    Affects Versions: 2.0
         Environment: Apache Maven 3.0.5 
(r01de14724cdef164cd33c7c8c2fe155faf9602da; 2013-02-19 08:51:28-0500)
Maven home: C:\Java\apache-maven-3.0.5\bin\..
Java version: 1.7.0_25, vendor: Oracle Corporation
Java home: C:\Program Files\Java\jdk1.7.0_25\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 7", version: "6.1", arch: "amd64", family: "windows"
            Reporter: Gary Gregory
            Assignee: Gary Gregory




---------- Forwarded message ----------
From: Bernd Eckenfels <[email protected]>
Date: Wed, Aug 7, 2013 at 5:36 PM
Subject: [VFS] RAM Provider review and questions
To: [email protected]


Hello,

I had a look at the VFS2 RAM Provider in 2.0 and have some questions comments:


RamFileData.java
----------------
#52 byte[] buffer - this is the actual content of the file, I would name it 
that way. Buffer sounds transient

#62 Collection<RamFileData> - this keeps references to all childer, aka the 
whole tree. When a implementation gets smarter about cache/buffers this could 
harm a partial (off heap) storage and similiar things. Change this to String[] 
basenames? (especially since lookup is rather fast in RAM)

#71 children = Collections.synchronizedCollection(new ArrayList<RamFileData>()) 
- there are multiple contains() lookups on this collection. This can be slow on 
larger directories. Would it make sense to have a adaptive data structure here 
(or provide a general infrastructure for small-to-large collections?

#136 this.buffer = new byte[0]; - I would use a "final static byte[] EMPTY" to 
avoid allocation

#186 contains/add - the "if contains throw / add" construct is quite readable 
however it requires two instead of one linear search and since it is not 
synchronized there is a window for racecondition. Using the return boolean of 
the add() method (on a collection which refuses duplicates) is more efficient.

#208 contains/remove - same argument first checking then removing has a race 
and is inefficient. remove returns null if it was not in there.

#225 return children; - is it safe to return a modifyable collection? See also 
RamFileSystem#108 which synchronizes across class boundaries.

#244 equals / RamFileData data = (RamFileData) o; - nitpick: i would use "that" 
or "other", "data" looks local

#256 this.getName().hashCode() - is it intentional to access field via getter, 
toString uses the field

#281: System.arraycopy(this.buffer, 0, newBuf, 0, size); - this will fail if 
resize() truncates. Is it guranteed not to happen?

I think the whole resize is generally strange, only using the setBuf() is more 
atomic and does not leak uninitilized bytes.


RamFileObject.java
------------------
#95 this.data.getBuffer().length; - use data.size() or even this.size()?

#125 setBuffer(new byte[0]) - use EMPTY constant

#272 does it make sense to put the whole quota calculation into fs. (especially 
the option builder)

#276 when "newSize - size" is negative the resize() is a truncation, it should 
not be rejected even when the overall size is over maxsize (can this actually 
happen?)

RamFileOutputStream.java
------------------------
#63 introduce "RamFileObject data = this.file.getData()" as it is dereferenced 
3 times in this method

#75 file.getData().getBuffer() - does not update last update date (if this is 
intended then should comment)

#61 I would change the whole logic to use setBuffer()

#87 write(buffer1) - replace with write(buffer1, 0, 1) this safes one method 
invocation

RamFileProvider.java
--------------------
#34 I think this is an unusal " *" line :)

RamFileSystem.java
------------------
#54 cache - is this actually a cache or is it the actual content store (see 
above RamFileData#62 for parent/child references). If it is intended to be the 
real storage it should be names that way.

#212 the Null argument is a bit missleading as the argument was not null by the 
state was imaginary.

#263 what a creative way to copy an input stream to an output stream. There is 
no need to Buffer the Ramoutput stream and you should never use the byte-wise 
read if not needed. I also wonder if there is no IOUtil to do that. Besides in 
this specific case it would be better to ask the source for size, resize the 
byte[] to it, read it fully and setBuffer() it (atomically). (and 512b is also 
a very small buffer, it should be more like 32k)

#273 the flush is not needed, close will flush (especially if you not buffer 
later on the flush is a nop)

#290 getClass/getMessage is seldom/never used anywhere else. FileSystemExeption 
should really have an IOExepction types constructor to be used consistently.

#306 int size() limits the maximum size for the Ram cache to 2GB, should be 
long (see also RamFileSystemConfigBuilder#68 which sets default to 
Integer.MAX_VALUE)

#306 fs.size() is actually used very often (especially when writing small 
write()s to the OutputStream. It is a synchronized full-cach-walker, looks like 
a bad bottlenek. Maybe it is better if a outputstream requests the size only 
once and calculates locally or use an atomicLong to account all changes instead 
of walking.

Generally I am not sure how the synchronisation/concurrency model of the 
providers looks like, so I did not comment on those aspects, but it looks like 
some methods exepct to be executed atomic.

Greetings
Bernd
-- 
http://bernd.eckenfels.net

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to