[
https://issues.apache.org/jira/browse/DRILL-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112388#comment-15112388
]
ASF GitHub Bot commented on DRILL-4303:
---------------------------------------
GitHub user k255 opened a pull request:
https://github.com/apache/drill/pull/335
DRILL-4303: ESRI Shapefile (shp) format plugin
Shp format plugin. The main idea is to read shapefiles so they can be joined
with other sources or converted to another format, e.g. a parquet file, which
can store geometry data in binary format (WKB) on hdfs.
The implementation is based on the esri java lib, which parses a single
geometry definition. Custom code (ShapefileByteBufferCursor) reads the whole
file. The plugin also reads the accompanying data file (dbf) and the srid
information (prj).
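For readers unfamiliar with the layout such a whole-file reader has to walk, here is a minimal Python sketch based on the published ESRI shapefile description. It is not the plugin's code: the function names `read_shp_header` and `iter_records` are hypothetical, and it only splits the main .shp file into a header and raw record payloads, leaving the geometry parsing to a library, as the plugin does.

```python
import struct


def read_shp_header(buf: bytes) -> dict:
    """Parse the fixed 100-byte header of a .shp main file (hypothetical helper)."""
    if len(buf) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # Bytes 0-3: file code, big-endian, always 9994.
    (file_code,) = struct.unpack(">i", buf[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code %d" % file_code)
    # Bytes 24-27: total file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", buf[24:28])
    # Bytes 28-35: version and shape type, both little-endian.
    version, shape_type = struct.unpack("<ii", buf[28:36])
    # Bytes 36-67: bounding box Xmin, Ymin, Xmax, Ymax as little-endian doubles.
    bbox = struct.unpack("<4d", buf[36:68])
    return {
        "file_bytes": 2 * length_words,
        "version": version,
        "shape_type": shape_type,
        "bbox": bbox,
    }


def iter_records(buf: bytes):
    """Yield (record_number, raw_content) pairs that follow the header."""
    offset = 100
    while offset + 8 <= len(buf):
        # Record header: record number and content length (in 16-bit words),
        # both big-endian.
        number, length_words = struct.unpack(">ii", buf[offset:offset + 8])
        start = offset + 8
        end = start + 2 * length_words
        # raw_content starts with the shape type and holds one geometry.
        yield number, buf[start:end]
        offset = end
```

Each `raw_content` chunk is the single geometry definition that a geometry library would then decode; the mixed big-endian and little-endian fields are part of the format itself.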
Sample usage:
- reading shp
```select *, ST_AsText(geom) from cp.`sample-data/CA-cities.shp`;```
- conversion to parquet
```alter session set `store.format`='parquet';```
```create table dfs.tmp.`/CA-cities-par` as select * from
cp.`sample-data/CA-cities.shp`;```
There is also a sample parquet file at cp.`sample-data/CA-cities.parquet`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/k255/drill drill-gis-shp
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/335.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #335
----
commit ecaa6ff5303cd179cc0c0f96518b1ee69ff40955
Author: potocki <[email protected]>
Date: 2016-01-22T11:21:04Z
ESRI Shapefile (shp) reader implemented as drill format plugin
commit 91ccd1ccf0d06802dcf0da2ee1ef83c903c248af
Author: potocki <[email protected]>
Date: 2016-01-22T12:19:00Z
added sample file in parquet format
----
> ESRI Shapefile (shp) format plugin
> ----------------------------------
>
> Key: DRILL-4303
> URL: https://issues.apache.org/jira/browse/DRILL-4303
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Other
> Reporter: Karol Potocki
>
> Allow Drill (drill-gis) to read esri shapefiles, one of the most popular
> geospatial formats.
> Format described here:
> https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
> It consists of three files (prj - srid information, dbf - data fields, shp -
> geometry data)