Are you asking about a simple JOIN?

Below I have a simple list of country_data.csv that I’m joining to an iso
dataset:

0: jdbc:drill:> select * from maprfs.cmatta.`iso_3166_2_countries.csv` limit 10;
+---------+
| columns |
+---------+
| ["Sort Order","Common Name","Formal Name","Type","Sub
Type","Sovereignty","Capital","ISO 4217 Currency Code","ISO 4217
Currency Name","ITU-T Telephone Code","ISO 3166-1 2 Letter Code","ISO
3166-1 3 Letter Code","ISO 3166-1 Number","IANA Country Code TLD\r"] |
| ["1","Afghanistan","Islamic State of Afghanistan","Independent
State","","","Kabul","AFN","Afghani","93","AF","AFG","4",".af\r"] |
| ["2","Albania","Republic of Albania","Independent
State","","","Tirana","ALL","Lek","355","AL","ALB","8",".al\r"] |
| ["3","Algeria","People's Democratic Republic of
Algeria","Independent
State","","","Algiers","DZD","Dinar","213","DZ","DZA","12",".dz\r"] |
| ["4","Andorra","Principality of Andorra","Independent
State","","","Andorra la
Vella","EUR","Euro","376","AD","AND","20",".ad\r"] |
| ["5","Angola","Republic of Angola","Independent
State","","","Luanda","AOA","Kwanza","244","AO","AGO","24",".ao\r"] |
| ["6","Antigua and Barbuda","","Independent State","","","Saint
John's","XCD","Dollar","-267","AG","ATG","28",".ag\r"] |
| ["7","Argentina","Argentine Republic","Independent
State","","","Buenos Aires","ARS","Peso","54","AR","ARG","32",".ar\r"]
|
| ["8","Armenia","Republic of Armenia","Independent
State","","","Yerevan","AMD","Dram","374","AM","ARM","51",".am\r"] |
| ["9","Australia","Commonwealth of Australia","Independent
State","","","Canberra","AUD","Dollar","61","AU","AUS","36",".au\r"] |
+---------+
10 rows selected (0.905 seconds)
0: jdbc:drill:> select * from maprfs.cmatta.`country_data.csv` limit 10;
+--------------------------------------------------------------------+
|                              columns                               |
+--------------------------------------------------------------------+
| ["1","Kabul","AFN","Afghani","93","AF","AFG","4",".af"]            |
| ["2","Tirana","ALL","Lek","355","AL","ALB","8",".al"]              |
| ["3","Algiers","DZD","Dinar","213","DZ","DZA","12",".dz"]          |
| ["4","Andorra la Vella","EUR","Euro","376","AD","AND","20",".ad"]  |
| ["5","Luanda","AOA","Kwanza","244","AO","AGO","24",".ao"]          |
+--------------------------------------------------------------------+
5 rows selected (0.35 seconds)
0: jdbc:drill:> select c.`columns`[1] as capital, i.`columns`[1] as
country from maprfs.cmatta.`country_data.csv` c JOIN
maprfs.cmatta.`iso_3166_2_countries.csv` i ON c.`columns`[7] =
i.`columns`[12];
+-------------------+--------------+
|      capital      |   country    |
+-------------------+--------------+
| Kabul             | Afghanistan  |
| Tirana            | Albania      |
| Algiers           | Algeria      |
| Andorra la Vella  | Andorra      |
| Luanda            | Angola       |
+-------------------+--------------+
5 rows selected (0.82 seconds)
0: jdbc:drill:>

What do you hope to accomplish with Pig? A lot of manipulation of data can
be accomplished with Drill and SQL.

On Tuesday, January 5, 2016, Peder Jakobsen | gmail <[email protected]>
wrote:

Does drill have something analogous  to  SQL  CONSTRAINT and REFERENCES for
> doing table type  lookups?
>
> I have a large data that uses numerical ISO country codes; they need to be
> replaced with country names that are mapped in a small CSV file (iso_code,
> country_name).
>
> Any suggestions?
>
> Thanks you,
>
> Peder
>
> P.S. Is it worth learning Pig Latin or something similar for more complex
> queries and data manipulation in Drill?
>
​

Reply via email to