The first column is of strings ...

Do you mean a single string, as in "KerfufledAllaHasballah",
or a "bunch of strings with some implied delimiter", such as
"Kerfufled/Alla/Hasballah", where "/" is the separator between strings?

If the latter, the data needs to be normalized.
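
To make "normalized" concrete, here is a rough sketch in Python using the
standard sqlite3 module. It assumes the "/" delimiter from the example above;
the table and column names (item, part, item_part) and the normalize() helper
are made up for illustration and are not taken from your actual data. Each
distinct substring is stored once, and a link table records which parts make
up which row and in what order.

```python
import sqlite3


def normalize(rows, sep="/"):
    """Load (delimited_string, value) pairs into normalized tables.

    Hypothetical schema: 'item' holds one row per input line, 'part' holds
    each distinct substring once, 'item_part' links them in order.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER NOT NULL);
        CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE item_part (
            item_id INTEGER NOT NULL REFERENCES item(id),
            pos     INTEGER NOT NULL,
            part_id INTEGER NOT NULL REFERENCES part(id),
            PRIMARY KEY (item_id, pos)
        );
    """)
    for text, value in rows:
        item_id = con.execute(
            "INSERT INTO item (value) VALUES (?)", (value,)
        ).lastrowid
        for pos, name in enumerate(text.split(sep)):
            # Store each distinct substring only once.
            con.execute("INSERT OR IGNORE INTO part (name) VALUES (?)", (name,))
            part_id = con.execute(
                "SELECT id FROM part WHERE name = ?", (name,)
            ).fetchone()[0]
            con.execute(
                "INSERT INTO item_part (item_id, pos, part_id) VALUES (?, ?, ?)",
                (item_id, pos, part_id),
            )
    con.commit()
    return con
```

Calling normalize([("Kerfufled/Alla/Hasballah", 1), ("Kerfufled/Alla", 2)])
stores "Kerfufled" and "Alla" once each, however many rows share them. Whether
this saves space depends on how much the substrings actually repeat.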

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Peng Yu
>Sent: Wednesday, 10 April, 2019 08:01
>To: SQLite mailing list
>Subject: Re: [sqlite] [EXTERNAL] compressed sqlite3 database file?
>
>I don't know specifically what you refer to as data normalization. My
>guess is something like this. But it is irrelevant to my case.
>
>https://www.studytonight.com/dbms/database-normalization.php
>
>For my specific TSV file, it has about 50 million rows and just two
>columns. The first column is of strings and the second column is of
>integers. All the strings in the first column are unique (some
>strings may be substrings of other strings though).
>
>On 4/10/19, Hick Gunter <h...@scigames.at> wrote:
>> I have the distinct impression that you are attempting to convert a flat
>> file into a naked table and pretending that the result is a (relational)
>> database.
>>
>> Please rethink your approach. There is a design process called
>> "normalization" that needs to be done first. This will identify "entities"
>> (with "attributes") and "relations" that will greatly reduce data
>> duplication found in flat files.
>
>--
>Regards,
>Peng
>_______________________________________________
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
