I'm not sure exactly what you mean by data normalization. My guess is
that it is something like the page below, but that is irrelevant to my case.

https://www.studytonight.com/dbms/database-normalization.php

My specific TSV file has about 50 million rows and just two columns.
The first column holds strings and the second column holds integers.
All the strings in the first column are unique (although some strings
may be substrings of other strings).
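
A minimal sketch of how a file with this shape might be loaded into
SQLite, using Python's standard sqlite3 and csv modules. The table name
"kv", the column names "key" and "value", the function name load_tsv,
and the batch size are all hypothetical, chosen only for illustration.
It also assumes the file is UTF-8, has no header row, and has no quoting.
Because the strings are unique, the first column is used as the primary
key here; that is one possible design, not the only one.

```python
import csv
import sqlite3


def load_tsv(tsv_path, db_path=":memory:", batch_size=100_000):
    """Load a two-column TSV (string, integer) into a SQLite table.

    Table and column names are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(db_path)
    # The strings are unique, so the first column can act as the primary key.
    # WITHOUT ROWID stores the rows directly in the primary-key B-tree.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv ("
        " key TEXT PRIMARY KEY,"
        " value INTEGER NOT NULL"
        ") WITHOUT ROWID"
    )
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: treat quote characters as ordinary data.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        batch = []
        for key, value in reader:
            batch.append((key, int(value)))
            # Insert in batches so 50 million rows never sit in memory at once.
            if len(batch) >= batch_size:
                conn.executemany("INSERT INTO kv VALUES (?, ?)", batch)
                batch.clear()
        if batch:
            conn.executemany("INSERT INTO kv VALUES (?, ?)", batch)
    # All inserts run inside one transaction, committed here.
    conn.commit()
    return conn
```

A lookup by string would then be something like
conn.execute("SELECT value FROM kv WHERE key = ?", (s,)).fetchone().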

On 4/10/19, Hick Gunter <h...@scigames.at> wrote:
> I have the distinct impression that you are attempting to convert a flat
> file into a naked table and pretending that the result is a (relational)
> database.
>
> Please rethink your approach. There is a design process called
> "normalization" that needs to be done first. This will identify "entities"
> (with "attributes") and "relations" that will greatly reduce data
> duplication found in flat files.

-- 
Regards,
Peng
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
