GitHub user k255 opened a pull request:
https://github.com/apache/drill/pull/335
DRILL-4303: ESRI Shapefile (shp) format plugin
Shp format plugin. The main idea is to read shapefiles so they can be joined
with other sources or converted to another format, e.g. a Parquet file, which
can store geometry data in binary form (WKB) on HDFS.
The implementation is based on the ESRI Java library, which parses a single
geometry definition. Custom code (ShapefileByteBufferCursor) reads the whole
file. The plugin also reads the accompanying data file (dbf) and the SRID
information (srid).
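For reviewers unfamiliar with the shp layout, here is a minimal sketch of what reading the whole file involves. It is not the ShapefileByteBufferCursor code from this pull request; the class and method names are hypothetical. It assumes only the published shapefile layout: a 100-byte main header with the big-endian file code 9994 at byte 0, the file length in 16-bit words at byte 24, and a little-endian shape type at byte 32, followed by records that each start with an 8-byte big-endian header (record number, content length in 16-bit words).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch; not the ShapefileByteBufferCursor from this pull request. */
final class ShapefileRecordWalker {
    static final int FILE_CODE = 9994;    // magic number, big-endian, at byte 0
    static final int HEADER_BYTES = 100;  // fixed size of the main file header

    /** Returns the shape type from the main header (little-endian int at byte 32). */
    public static int shapeType(ByteBuffer shp) {
        // duplicate() shares the content and always starts out big-endian
        ByteBuffer b = shp.duplicate();
        if (b.getInt(0) != FILE_CODE) {
            throw new IllegalArgumentException("not a shapefile: bad file code");
        }
        b.order(ByteOrder.LITTLE_ENDIAN);
        return b.getInt(32);
    }

    /** Counts records by hopping from one 8-byte record header to the next. */
    public static int countRecords(ByteBuffer shp) {
        ByteBuffer b = shp.duplicate();   // big-endian view for the record headers
        // The header stores the file length in 16-bit words; never read past the buffer
        int fileBytes = Math.min(b.getInt(24) * 2, b.limit());
        int pos = HEADER_BYTES;
        int count = 0;
        while (pos + 8 <= fileBytes) {
            int contentWords = b.getInt(pos + 4);  // content length in 16-bit words
            if (contentWords < 0) {
                throw new IllegalArgumentException("corrupt record header at byte " + pos);
            }
            // A real reader would hand the content bytes to the geometry parser here
            pos += 8 + contentWords * 2;
            count++;
        }
        return count;
    }
}
```

In a real reader, the content bytes of each record would be passed to the ESRI library to build the geometry, and the matching dbf row would supply the attribute columns.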
Sample usage:
- reading shp
```select *, ST_AsText(geom) from cp.`sample-data/CA-cities.shp`;```
- conversion to parquet
```alter session set `store.format`='parquet';```
```create table dfs.tmp.`/CA-cities-par` as select * from
cp.`sample-data/CA-cities.shp`;```
There is also a sample Parquet file at cp.`sample-data/CA-cities.parquet`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/k255/drill drill-gis-shp
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/335.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #335
----
commit ecaa6ff5303cd179cc0c0f96518b1ee69ff40955
Author: potocki <[email protected]>
Date: 2016-01-22T11:21:04Z
ESRI Shapefile (shp) reader implemented as drill format plugin
commit 91ccd1ccf0d06802dcf0da2ee1ef83c903c248af
Author: potocki <[email protected]>
Date: 2016-01-22T12:19:00Z
added sample file in parquet format
----