Re: [sqlite] Populating dyadic dataset

John Machin Mon, 01 Jun 2009 16:55:35 -0700

On 2/06/2009 8:07 AM, Vincent Arel wrote:
> Hi everyone,
> 
> I'm very, very new to SQLite, and would appreciate any help I can get.


Unless I'm very very confused, this has very little to do with SQL at 
all (let alone SQLite) apart from using an INSERT statement to dispose 
of the final product.

You might be better off asking on a forum related to whatever language 
you are using.

Your problem description is a tad ambiguous, so let's see if we can get 
that improved before we send you off elsewhere.

Let's guess that you ultimately want to record two-way trade statistics 
for all possible1 country pairs for all possible2 years, for some 
definition of possible1 and possible2.

> 
> I have 3 long vectors that look like this:
> {"ALB","CAN", "DZA",...}
> {"ALB","CAN", "DZA",...}
> {"1961","1962", "1963",...}
> 
> And I want to create a table that looks like this:
> 
> ID    Var1    Var2    Var3

Using meaningful names is strongly suggested ... e.g. from_country, 
to_country, trade_year


> 1    ALB    CAN    1961
> 1    ALB    CAN    1962
> 1    ALB    CAN    1963
> 2    ALB    DZA    1961
> 2    ALB    DZA    1962
> 2    ALB    DZA    1963
> 3    CAN    ALB    1961
> 3    CAN    ALB    1962
> 3    CAN    ALB    1963
> 4    CAN    DZA    1961
> 4    CAN    DZA    1962
> 4    CAN    DZA    1963
> 5    DZA    ALB    1961
> 5    DZA    ALB    1962
> 5    DZA    ALB    1963
> 6    DZA    CAN    1961
> 6    DZA    CAN    1962
> 6    DZA    CAN    1963
> 
> In short, I need to include every possible pair of Var1/Var2 values (where
> Var1/Var2 != Var2/Var1.

Huh? Var1/Var2 == Var2/Var1 iff var1 == var2 ... if you mean "where var1 
!= var2" why not say so? If you mean something else (possible since you 
later mention /also/ dropping rows where var1 == var2) then please explain.

> I want to keep permutations.)

What does that mean?

> For each of these
> pairs, I need to create separate rows for each different value of Var3.

What if there are gaps in your third vector? Should they be filled in?

> I also need to drop rows where Var1 == Var2.

See above.

 > Finally, I would like to generate
> a unique ID number for each Var1/Var2 pair.

Aha! A vague connection with SQL :-) Why bother? You require the (var1, 
var2) tuple to be unique anyway ... what's the point of having another 
(redundant) column?

> Of course, I do not expect a ready-made answer. However, if some of you
> could tell me where I should start looking for a solution to my problem, or
> if you have any conceptual programming hints that could help me produce the
> needed table, I would be extremely grateful.

Expressing the "hints" in an expressive language like Python, and 
assuming the widest definitions of "possible", but basing it on your data:

var1 = ["ALB", "CAN", "DZA", ...]
var2 = ["ALB", "CAN", "DZA", ...]
var3 = ["1961", "1962", "1963",...]
# get a list of unique country codes, in sorted order
countries = list(set(var1 + var2))
countries.sort()
# convert years to integer, find range
var3int = [int(y) for y in var3]
firsty = min(var3int)
lasty = max(var3int)
year_range = range(firsty, lasty + 1)
# do the business
id = 0
ncountries = len(countries)
for i in range(ncountries - 1):
     for j in range(i + 1, ncountries):
        assert countries[i] != countries[j]
         id += 1
         for year in year_range:
             print id, countries[i], countries[j], year

BTW, this is all keys ... where's the data?

HTH,
John
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Populating dyadic dataset

Reply via email to