Melvin: haha, yeah, it's a download from the Clark County, NV voterfile
website. It's just the format they send out to people who request the file.
I worked this summer doing QA on voterfile builds so I'm familiar with the
data. I thought it would be good stuff to start with.

But thank you for the code showing how to fix that flat file. I'm half-teaching
myself SQL and building a database for some game data I have, with the plan
to sit Tableau on top of it.

btw, you guys are really nice and helpful. I'm definitely coming back,
you're making me want to make some progress on that dashboard.

On Wed, Dec 7, 2016 at 8:58 PM, Melvin Davidson <melvin6...@gmail.com>
wrote:

>
>
> On Wed, Dec 7, 2016 at 8:25 PM, Adrian Klaver <adrian.kla...@aklaver.com>
> wrote:
>
>> On 12/07/2016 05:19 PM, metaresolve wrote:
>>
>>> Uh, yeah, it was a SELECT * from cc_20161207;
>>>
>>> I know, it was dumb. I didn't realize it would break it or at least run
>>> for
>>> a while. I tend to do things in small steps, run a query, check my
>>> results,
>>> then tweak.
>>>
>>> You're right, I wouldn't want to be viewing those million rows. So I guess I
>>> could just be doing the top 10000 rows to get a sense, then be doing my
>>> aggregate group bys/etc to get my final numbers? That's how I hacked
>>> Access
>>> to be my data analysis program until I got to Alteryx. But I was also
>>> never
>>> using files of 1M then.
>>>
>>>
>> FYI pgAdmin, if I remember correctly, has a setting that limits the
>> maximum number of rows that it will fetch at a time.
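>>
>> You can also cap the result set yourself in the query (just a sketch,
>> not tied to any pgAdmin setting) to sample the first 10000 rows:
>>
>> SELECT * FROM cc_20161207 LIMIT 10000;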
>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
> With regard to John's comments:
> >Postgres, on a properly scaled and tuned database server, can handle
> >billions of records.
> >Obviously, doing something silly like querying all billions at once will
> >never be fast, that's a lot of data to marshal and process.
>
> On a similar note, your table structure is not exactly optimized for a
> relational database.
> I see no primary key to identify each row. In addition, the election &
> vote columns
> belong in a separate table that is a child of cc_20161207, since they are
> repeating fields.
>
> eg:
> ALTER TABLE cc_20161207
>   ADD COLUMN cc_20161207_pk serial;
>
> UPDATE cc_20161207
>   SET cc_20161207_pk = nextval('cc_20161207_pk_seq')
> WHERE cc_20161207_pk IS NULL;
>
> ALTER TABLE cc_20161207
>   ALTER COLUMN cc_20161207_pk SET NOT NULL;
> ALTER TABLE cc_20161207
>   ADD CONSTRAINT cc_20161207_pkey PRIMARY KEY (cc_20161207_pk);
>
>
> CREATE TABLE election_data (
> cc_20161207_pk bigint NOT NULL,
> election_id serial NOT NULL,
> election varchar(4),
> vote_type varchar(1),
> CONSTRAINT election_data_pk PRIMARY KEY (cc_20161207_pk, election_id)
> );
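>
> To populate the child table, you could then move the repeating fields
> over with something like this (a sketch only; election_1 and vote_type_1
> are hypothetical column names, so substitute your actual repeating
> columns, one INSERT per repeating group):
>
> INSERT INTO election_data (cc_20161207_pk, election, vote_type)
> SELECT cc_20161207_pk, election_1, vote_type_1
>   FROM cc_20161207
>  WHERE election_1 IS NOT NULL;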
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>
