mbeckerle commented on code in PR #1274:
URL: https://github.com/apache/daffodil/pull/1274#discussion_r1725144259


##########
daffodil-cli/src/main/scala/org/apache/daffodil/cli/Main.scala:
##########
@@ -1165,13 +1168,24 @@ class Main(
           case Some(processor) => {
             Assert.invariant(!processor.isError)
             val input = parseOpts.infile.toOption match {
-              case Some("-") | None => STDIN
+              case Some("-") | None => InputSourceDataInputStream(STDIN)
               case Some(file) => {
-                val f = new File(file)
-                new FileInputStream(f)
+                // for files <= 2GB, use a mapped byte buffer to avoid the overhead related to
+                // the BucketingInputSource. Larger files cannot be mapped so we cannot avoid it
+                val path = Paths.get(file)
+                val size = Files.size(path)
+                if (size <= Int.MaxValue) {

Review Comment:
   The doc for the map method says:
   
       For most operating systems, mapping a file into memory is more expensive 
       than reading or writing a few tens of kilobytes of data via the usual 
       read and write methods.
   
   So should there also be a floor check? E.g., below some size we just read 
   the file into a byte buffer and avoid the map.
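A minimal sketch of what such a floor check might look like, offered as an illustration rather than as the PR's code. The object name `InputChoice`, the helper `bufferFor`, and the 64 KiB value of `MapFloorBytes` are all hypothetical; the right threshold would have to be measured. Files below the floor are read into a heap buffer, files up to `Int.MaxValue` bytes are mapped, and anything larger returns `None` so the caller can fall back to the streaming path.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{ Files, Path, StandardOpenOption }

object InputChoice {
  // Hypothetical floor (an assumption, not a measured value).
  val MapFloorBytes: Long = 64L * 1024

  /**
   * Some(buffer) when the file fits in a single ByteBuffer,
   * None when it is too large and must be streamed instead.
   */
  def bufferFor(path: Path, floor: Long = MapFloorBytes): Option[ByteBuffer] = {
    val size = Files.size(path)
    if (size < floor) {
      // Small file: a plain read avoids the cost of setting up a mapping.
      Some(ByteBuffer.wrap(Files.readAllBytes(path)))
    } else if (size <= Int.MaxValue) {
      // Medium file: map it. The mapping stays valid after the channel closes.
      val fc = FileChannel.open(path, StandardOpenOption.READ)
      try Some(fc.map(FileChannel.MapMode.READ_ONLY, 0, size))
      finally fc.close()
    } else {
      // Too large for one mapping: caller falls back to the stream-based input.
      None
    }
  }
}
```

The caller would then pass the resulting buffer, or the fallback stream, to whichever input-source constructor applies.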
   



##########
daffodil-cli/src/main/scala/org/apache/daffodil/cli/Main.scala:
##########
@@ -1165,13 +1168,24 @@ class Main(
           case Some(processor) => {
             Assert.invariant(!processor.isError)
             val input = parseOpts.infile.toOption match {
-              case Some("-") | None => STDIN
+              case Some("-") | None => InputSourceDataInputStream(STDIN)
               case Some(file) => {
-                val f = new File(file)
-                new FileInputStream(f)
+                // for files <= 2GB, use a mapped byte buffer to avoid the overhead related to
+                // the BucketingInputSource. Larger files cannot be mapped so we cannot avoid it
+                val path = Paths.get(file)
+                val size = Files.size(path)
+                if (size <= Int.MaxValue) {
+                  val fc = FileChannel.open(path, StandardOpenOption.READ)
+                  val bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, size)

Review Comment:
   This is in the CLI. Could this be done inside the API so that all 
   applications benefit from it? 
   
   E.g., could InputSourceDataInputStream(is) analyze the input stream to see 
   whether it is a file of a suitable size?
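One way the API-side detection might be sketched, under stated assumptions: `StreamProbe` and `tryMap` are hypothetical names, not part of Daffodil. The probe only recognizes a bare `java.io.FileInputStream`; a wrapped stream (for example a `BufferedInputStream`) or a pipe would not be detected and would take the existing streaming path.

```scala
import java.io.{ FileInputStream, InputStream }
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

object StreamProbe {
  /**
   * Some(mapped buffer) when `is` is a FileInputStream whose remaining
   * bytes fit in a single mapping; None in every other case.
   */
  def tryMap(is: InputStream): Option[ByteBuffer] = is match {
    case fis: FileInputStream =>
      val fc = fis.getChannel
      val pos = fc.position() // respect bytes the caller already consumed
      val remaining = fc.size() - pos
      if (remaining > 0 && remaining <= Int.MaxValue)
        Some(fc.map(FileChannel.MapMode.READ_ONLY, pos, remaining))
      else
        None // empty, non-regular, or too large: keep using the stream
    case _ =>
      None // not a file stream: nothing to map
  }
}
```

A factory method could call such a probe first and fall back to the stream-based source when it returns `None`, which would also be the natural place for a lower size bound.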
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
