[jira] [Work logged] (HADOOP-17347) ABFS: Optimise read for small files/tails of files

ASF GitHub Bot (Jira) Sat, 09 Jan 2021 19:58:22 -0800


     [ 
https://issues.apache.org/jira/browse/HADOOP-17347?focusedWorklogId=533576&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533576
 ]


ASF GitHub Bot logged work on HADOOP-17347:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Jan/21 03:57
            Start Date: 10/Jan/21 03:57
    Worklog Time Spent: 10m 
      Work Description: steveloughran commented on pull request #2464:
URL: https://github.com/apache/hadoop/pull/2464#issuecomment-756759257


   It's moot now that the PR is merged, but can I remind everyone of the following:
   
   ## Add a meaningful message to assertTrue/assertFalse asserts
   
   Imagine that you are trying to debug a test run from an automated build. All 
you have is that an assert failed on a given line. Does that provide enough 
information to diagnose the problem? Or would you need extra information? If 
so: what information should be included?
   
   It's OK to use AssertJ's assertThat, which is what new tests are moving to. 
It's a bit more verbose, but its assertions are very informative and easily 
extensible. If you haven't used the library yet, it's already on the classpath: 
try using its assertions in new test suites.
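
As a rough illustration of the point about assertion messages, here is a minimal, self-contained sketch. It is not code from the PR: `AssertMessageSketch`, `describeMismatch` and the sample byte arrays are hypothetical, and the local `assertTrue` is only a stand-in with the same shape as JUnit 4's `Assert.assertTrue(String message, boolean condition)`. With AssertJ, the same context would typically be attached through `describedAs(...)`.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from the Hadoop code base.
final class AssertMessageSketch {

    // Local stand-in shaped like JUnit 4's Assert.assertTrue(String message, boolean condition).
    static void assertTrue(String message, boolean condition) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Builds a message carrying the context needed to diagnose a failure from a build log alone:
    // what was checked, where, and the expected and actual values.
    static String describeMismatch(String what, long offset, byte[] expected, byte[] actual) {
        return String.format("%s at offset %d: expected %s but got %s",
            what, offset, Arrays.toString(expected), Arrays.toString(actual));
    }

    public static void main(String[] args) {
        byte[] expected = {1, 2, 3};
        byte[] actual = {1, 2, 4};
        try {
            // Without the message, the log would only say that an assert failed on this line.
            assertTrue(describeMismatch("buffer contents", 0, expected, actual),
                Arrays.equals(expected, actual));
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The message costs one extra argument, and it is the only information available when the failure shows up in an automated build.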


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 533576)
    Time Spent: 12.5h  (was: 12h 20m)

> ABFS: Optimise read for small files/tails of files
> --------------------------------------------------
>
>                 Key: HADOOP-17347
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17347
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.0
>            Reporter: Bilahari T H
>            Assignee: Bilahari T H
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Optimize read performance for the following scenarios
>  # Read small files completely
>  Files smaller than the read buffer size can be considered small files. 
> For such files it would be better to read the full file into the 
> AbfsInputStream buffer.
>  # Read last block if the read is for footer
>  If the read is for the last 8 bytes, read the full file.
>  This will optimize reads for parquet files. [Parquet file 
> format|https://www.ellicium.com/parquet-file-format-structure/]
> Both optimizations will be controlled by the following configs:
>  # fs.azure.read.smallfilescompletely
>  # fs.azure.read.optimizefooterread
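
The two scenarios in the description can be sketched as a simple decision function. This is only one reading of the issue text, not the actual AbfsInputStream implementation: the class and method names are hypothetical, the two boolean parameters stand in for fs.azure.read.smallfilescompletely and fs.azure.read.optimizefooterread, and treating a "footer read" as a read of exactly the last 8 bytes is an assumption.

```java
// Hypothetical sketch of the decisions described in the issue; not the AbfsInputStream code.
final class AbfsReadPlanSketch {

    // Footer size taken from the issue text ("the last 8 bytes").
    static final int FOOTER_SIZE = 8;

    // A file counts as small when it is smaller than the read buffer.
    static boolean isSmallFile(long fileLength, int bufferSize) {
        return fileLength < bufferSize;
    }

    // Assumption: a footer read covers exactly the last FOOTER_SIZE bytes of the file.
    static boolean isFooterRead(long position, int length, long fileLength) {
        return length == FOOTER_SIZE && position == fileLength - FOOTER_SIZE;
    }

    // The two flags stand in for fs.azure.read.smallfilescompletely and
    // fs.azure.read.optimizefooterread respectively.
    static boolean shouldReadFullFile(long position, int length, long fileLength, int bufferSize,
            boolean readSmallFilesCompletely, boolean optimizeFooterRead) {
        if (readSmallFilesCompletely && isSmallFile(fileLength, bufferSize)) {
            return true;
        }
        return optimizeFooterRead && isFooterRead(position, length, fileLength);
    }
}
```

For example, under these assumptions a 1024-byte file with a 4096-byte buffer is read in full when the small-file flag is on, and a read of bytes 9992 to 9999 of a 10000-byte file triggers the footer path when the footer flag is on.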



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
