[ https://issues.apache.org/jira/browse/TIKA-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Lepore updated TIKA-4067:
---------------------------------
    Description: 
The ASPRS Lidar data format occurs over 11,000 times in the latest Common Crawl 
dataset. There are three signatures, one for each of versions 1.0, 1.1, and 
1.2. There does not appear to be a MIME type that covers this format.

The full signatures are below, but they can be simplified to:

 

4C415346\{20}01(00|01|02)

 

That is, the ASCII string LASF, then 20 arbitrary bytes, then one of the byte 
pairs 0100, 0101, or 0102 for versions 1.0, 1.1, and 1.2 respectively.
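
As a rough illustration of the simplified signature, the check can be sketched in Python. This is not Tika code; the names LAS_MAGIC and detect_las_version are hypothetical, and the sketch assumes the signature is anchored at the start of the file.

```python
import re

# Simplified signature from the description: "LASF", any 20 bytes,
# then 0x01 followed by 0x00, 0x01, or 0x02.
# re.DOTALL is needed so that "." also matches the byte 0x0A.
LAS_MAGIC = re.compile(rb"LASF.{20}\x01[\x00-\x02]", re.DOTALL)


def detect_las_version(header: bytes):
    """Return "1.0", "1.1", or "1.2" if the header matches, else None."""
    if LAS_MAGIC.match(header) is None:
        return None
    # Bytes 24 and 25 hold the two version bytes (4 + 20 bytes precede them).
    return f"{header[24]}.{header[25]}"
```

Only the first 26 bytes of a file are needed for this check.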

 

https://www.nationalarchives.gov.uk/PRONOM/fmt/370

 
||External signatures|File extension: las
File extension: laz|
||Internal signatures||
||Name|ASPRS Lidar Data Exchange Format 1.0|
||Description|ASCII header: LASF, followed after 20 bytes by version number 1.0|
||Byte sequences||
||Position type|Absolute from BOF|
||Offset|0|
||Byte order| |
||Value|4C415346\{20}0100\{78}[00:99]|



 
||External signatures|File extension: las
File extension: laz|
||Internal signatures||
||Name|ASPRS Lidar Data Exchange Format 1.1|
||Description|ASCII header: LASF, followed after 20 bytes by version number 1.1|
||Byte sequences||
||Position type|Absolute from BOF|
||Offset|0|
||Byte order| |
||Value|4C415346\{20}0101\{78}[00:99]|



 
||External signatures|File extension: las
File extension: laz|
||Internal signatures||
||Name|ASPRS Lidar Data Exchange Format 1.2|
||Description|ASCII header: LASF, followed after 20 bytes by version number 1.2|
||Byte sequences||
||Position type|Absolute from BOF|
||Offset|0|
||Byte order| |
||Value|4C415346\{20}0102\{78}[00:99]|
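
The full PRONOM-style signatures above can be sketched the same way. This is an illustration only, not Tika code. It assumes that [00:99] denotes a single byte in the inclusive range 0x00 to 0x99 and that the pattern is anchored at offset 0; the function name matches_full_signature is hypothetical.

```python
def matches_full_signature(data: bytes, minor: int) -> bool:
    """Check 4C415346{20}01<minor>{78}[00:99] anchored at offset 0.

    Layout assumed from the pattern:
      bytes 0-3    "LASF"
      bytes 4-23   20 arbitrary bytes
      bytes 24-25  0x01 and the minor version byte
      bytes 26-103 78 arbitrary bytes
      byte 104     one byte in the range 0x00..0x99 (assumption)
    """
    if len(data) < 105:
        return False
    return (
        data[0:4] == b"LASF"
        and data[24] == 0x01
        and data[25] == minor
        and 0x00 <= data[104] <= 0x99
    )
```

Calling it with minor set to 0, 1, or 2 corresponds to the three signatures for versions 1.0, 1.1, and 1.2.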



 

 

  was:
The ASPRS Lidar data format occurs over 11,000 times in the latest Common Crawl 
dataset. There are three signatures to cover the three major versions. There 
does not appear to be a mime type that covers this format.

The full signatures are below, but they can be simplified to:

 

4C415346\{20}01(00|01|02)

 

which is ASCII LASF, followed after 20 bytes by one of 0100, 0101, or 0102 for 
versions 1.0, 1.1, and 1.2.

 
||External signatures|File extension: las
File extension: laz|
||Internal signatures|
||Name|ASPRS Lidar Data Exchange Format 1.0|
||Description|ASCII header: LASF, followed after 20 bytes by version number 1.0|
||Byte sequences|
||Position type|Absolute from BOF|
||Offset|0|
||Byte order| |
||Value|4C415346\{20}0100\{78}[00:99]|
|
|

 
||External signatures|File extension: las
File extension: laz|
||Internal signatures|
||Name|ASPRS Lidar Data Exchange Format 1.1|
||Description|ASCII header: LASF, followed after 20 bytes by version number 1.1|
||Byte sequences|
||Position type|Absolute from BOF|
||Offset|0|
||Byte order| |
||Value|4C415346\{20}0101\{78}[00:99]|
|
|

 
||External signatures|File extension: las
File extension: laz|
||Internal signatures|
||Name|ASPRS Lidar Data Exchange Format 1.2|
||Description|ASCII header: LASF, followed after 20 bytes by version number 1.2|
||Byte sequences|
||Position type|Absolute from BOF|
||Offset|0|
||Byte order| |
||Value|4C415346\{20}0102\{78}[00:99]|
|
|

 

 


> Add magic for ASPRS Lidar data
> ------------------------------
>
>                 Key: TIKA-4067
>                 URL: https://issues.apache.org/jira/browse/TIKA-4067
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Gregory Lepore
>            Priority: Minor
>         Attachments: 
> 0a0b002a319eea990e13da3d197fe4948e2cb8b72d02d5fa19c42382b1548a63, 
> 0a0b757ac6ca7692a14645aed1ff2c2f5d7db11533087dac1855b521929d71c0, 
> 0a0d0890d50d693831e5e3bb9f08e927760609408eaa5d86cf6e0d56e122e0e1
>
>
> The ASPRS Lidar data format occurs over 11,000 times in the latest Common 
> Crawl dataset. There are three signatures, one for each of versions 1.0, 1.1, 
> and 1.2. There does not appear to be a MIME type that covers this format.
> The full signatures are below, but they can be simplified to:
>  
> 4C415346\{20}01(00|01|02)
>  
> That is, the ASCII string LASF, then 20 arbitrary bytes, then one of the byte 
> pairs 0100, 0101, or 0102 for versions 1.0, 1.1, and 1.2 respectively.
>  
> https://www.nationalarchives.gov.uk/PRONOM/fmt/370
>  
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures||
> ||Name|ASPRS Lidar Data Exchange Format 1.0|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number 1.0|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0100\{78}[00:99]|
>  
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures||
> ||Name|ASPRS Lidar Data Exchange Format 1.1|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number 1.1|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0101\{78}[00:99]|
>  
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures||
> ||Name|ASPRS Lidar Data Exchange Format 1.2|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number 1.2|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0102\{78}[00:99]|
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
