Akihiko Kusanagi created DRILL-4364:
---------------------------------------
Summary: Image Metadata Format Plugin
Key: DRILL-4364
URL: https://issues.apache.org/jira/browse/DRILL-4364
Project: Apache Drill
Issue Type: New Feature
Components: Storage - Other
Reporter: Akihiko Kusanagi
Support querying metadata in various image formats. This plugin leverages
[metadata-extractor|https://github.com/drewnoakes/metadata-extractor]. It is
especially useful for querying a large number of image files stored in a
distributed file system without building a metadata repository in advance.
This plugin supports the following file formats.
* JPEG, TIFF, WebP, PSD, PNG, BMP, GIF, ICO, PCX
* Camera Raw: NEF (Nikon), CR2 (Canon), ORF (Olympus), ARW (Sony), RW2
(Panasonic), RWL (Leica), SRW (Samsung)
This plugin can read the following metadata.
* Exif, IPTC, XMP, JFIF / JFXX, ICC Profiles, Photoshop fields, WebP
properties, PNG properties, BMP properties, GIF properties, ICO properties, PCX
properties
Since each type of metadata has a different set of fields, the plugin returns a
set of commonly used fields, such as the image width, height, and bits per
pixel, for ease of use.
*Examples:*
Querying a JPEG file with the property descriptive: true
{noformat}
0: jdbc:drill:zk=local> select * from dfs.`4349313028_f69ffa0257_o.jpg`;
+----------+----------+--------------+--------+----------+-----------+------------+-------------+--------------+------------+-----------+----------+-----+---------------+------+------+------+------------+----------+------------------+
| FileName | FileSize | FileDateTime | Format | DPIWidth | DPIHeight | PixelWidth | PixelHeight | BitsPerPixel | Orientaion | ColorMode | HasAlpha | GPS | ExifThumbnail | JFIF | IPTC | JPEG | ExifSubIFD | ExifIFD0 | Interoperability |
+----------+----------+--------------+--------+----------+-----------+------------+-------------+--------------+------------+-----------+----------+-----+---------------+------+------+------+------------+----------+------------------+
| 4349313028_f69ffa0257_o.jpg | 257213 bytes | Mon Feb 01 18:00:56 JST 2016 | JPEG | 96.0 | 96.0 | 1199 | 800 | 24 | Unknown (0) | RGB | false | {"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | {"ThumbnailCompression":"JPEG (old-style)","XResolution":"72 dots per inch","YResolution":"72 dots per inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 bytes","ThumbnailLength":"7213 bytes"} | {"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 dots","YResolution":"96 dots"} | {"Keywords":"135;2002;issaquah;police car;wa;washington"} | {"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y component: Quantization table 0, Sampling factors 2 horiz/2 vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert"} | {"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | {"Software":"Picasa 3.0"} | {"InteroperabilityIndex":"Unknown ( )","InteroperabilityVersion":"1.00"} |
+----------+----------+--------------+--------+----------+-----------+------------+-------------+--------------+------------+-----------+----------+-----+---------------+------+------+------+------------+----------+------------------+
1 row selected (1.712 seconds)
{noformat}
Querying a JPEG file with the property descriptive: false
{noformat}
0: jdbc:drill:zk=local> select * from dfs.`4349313028_f69ffa0257_o.jpg`;
+----------+----------+--------------+--------+----------+-----------+------------+-------------+--------------+------------+-----------+----------+-----+---------------+------+------+------+------------+----------+------------------+
| FileName | FileSize | FileDateTime | Format | DPIWidth | DPIHeight | PixelWidth | PixelHeight | BitsPerPixel | Orientaion | ColorMode | HasAlpha | GPS | ExifThumbnail | JFIF | IPTC | JPEG | ExifSubIFD | ExifIFD0 | Interoperability |
+----------+----------+--------------+--------+----------+-----------+------------+-------------+--------------+------------+-----------+----------+-----+---------------+------+------+------+------------+----------+------------------+
| 4349313028_f69ffa0257_o.jpg | 257213 | 2016-02-01 09:00:56.0 | JPEG | 96.0 | 96.0 | 1199 | 800 | 24 | 0 | RGB | false | {"GPSVersionID":[0,0,2,2],"GPSLatitudeRef":"N","GPSLatitude":47.53777313232332,"GPSLongitudeRef":"W","GPSLongitude":-122.03510284423795,"GPSAltitudeRef":0,"GPSAltitude":0.0} | {"ThumbnailCompression":6,"XResolution":72.0,"YResolution":72.0,"ResolutionUnit":2,"ThumbnailOffset":414,"ThumbnailLength":7213} | {"Version":257,"ResolutionUnits":1,"XResolution":96,"YResolution":96} | {"Keywords":["135","2002","issaquah","police car","wa","washington"]} | {"CompressionType":0,"DataPrecision":8,"ImageHeight":800,"ImageWidth":1199,"NumberOfComponents":3,"Component1":{"ComponentId":1,"HorizontalSamplingFactor":2,"VerticalSamplingFactor":2,"QuantizationTableNumber":0},"Component2":{"ComponentId":2,"HorizontalSamplingFactor":1,"VerticalSamplingFactor":1,"QuantizationTableNumber":1},"Component3":{"ComponentId":3,"HorizontalSamplingFactor":1,"VerticalSamplingFactor":1,"QuantizationTableNumber":1}} | {"ExifVersion":"MDIxMA==","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | {"Software":"Picasa 3.0"} | {"InteroperabilityIndex":" ","InteroperabilityVersion":"MDEwMA=="} |
+----------+----------+--------------+--------+----------+-----------+------------+-------------+--------------+------------+-----------+----------+-----+---------------+------+------+------+------------+----------+------------------+
1 row selected (0.444 seconds)
{noformat}
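With descriptive: false, ExifVersion and InteroperabilityVersion appear as "MDIxMA==" and "MDEwMA==" rather than "2.10" and "1.00". These look like Base64-encoded byte arrays; that is an assumption drawn from the output above, not something this issue states. A minimal client-side sketch in Python, with a hypothetical helper name, shows how such values could be decoded:

```python
import base64


def decode_ascii_bytes(value: str) -> str:
    """Decode a Base64 string into ASCII text.

    Hypothetical helper: assumes the plugin emits raw byte-array fields
    as Base64 when descriptive is false.
    """
    return base64.b64decode(value).decode("ascii")


# Values copied from the descriptive: false output above.
exif_version = decode_ascii_bytes("MDIxMA==")
interop_version = decode_ascii_bytes("MDEwMA==")

print(exif_version)     # 0210
print(interop_version)  # 0100
```

The decoded strings "0210" and "0100" are consistent with the descriptive values "2.10" and "1.00" in the first example.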
Retrieving GPS location data from the Exif metadata for use with GIS functions
{noformat}
0: jdbc:drill:zk=local> select t.GPS.GPSLatitude, t.GPS.GPSLongitude from dfs.`4349313028_f69ffa0257_o.jpg` t;
+--------------------+----------------------+
| EXPR$0 | EXPR$1 |
+--------------------+----------------------+
| 47.53777313232332 | -122.03510284423795 |
+--------------------+----------------------+
1 row selected (0.256 seconds)
{noformat}
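The decimal degrees returned here correspond to the degrees/minutes/seconds strings shown with descriptive: true. As a rough cross-check, the following Python sketch converts those strings on the client side. The function name and the accepted string pattern are assumptions based only on the two sample values in this issue, and this is not how the plugin itself computes the numbers:

```python
import re


def dms_to_decimal(dms: str) -> float:
    """Convert a string such as 47° 32' 15.98" to decimal degrees.

    Hypothetical helper: the pattern is inferred from the sample output
    and may not cover every form the plugin can produce.
    """
    match = re.fullmatch(r"\s*(-?)(\d+)°\s*(\d+)'\s*(\d+(?:\.\d+)?)\"\s*", dms)
    if match is None:
        raise ValueError(f"unrecognized DMS string: {dms!r}")
    sign, degrees, minutes, seconds = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # The sample output carries the sign on the degrees component.
    return -value if sign else value


# Strings copied from the descriptive: true output.
latitude = dms_to_decimal("47° 32' 15.98\"")
longitude = dms_to_decimal("-122° 2' 6.37\"")

print(round(latitude, 4), round(longitude, 4))
```

Because the descriptive strings round the seconds to two decimal places, the converted values agree with the raw ones only to about five decimal places. This suggests descriptive: false is the better fit when the coordinates feed numeric functions.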