There's a bit more involved than just consolidating the files into one. Specifically, since the command-line prompt on all customer Linux machines is formatted a certain way, I can easily identify which machine I'm looking at and filter results based on that. Because I'm looking at around 1000 files and 600 MB of raw text, it isn't easily consumable by a human, so the only identifiers I need right now are the start date (based on the file name), the end date (based on the file timestamp), and the server itself (a single log file can contain a mix of servers), which, as I said, comes from the command line. Removing duplicate lines is also important to me, as there's no reason to have 500 entries of me initiating the SSH session to a customer site.
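To make that concrete, here's roughly what I have in mind for pulling those identifiers out, as a Python sketch. The filename pattern and prompt regex below are placeholders, not the real site formats:

    import hashlib
    import os
    import re
    from datetime import date, datetime

    # Placeholder: assumes filenames carry a date like "session_2017-02-02.log".
    FILENAME_DATE = re.compile(r"(\d{4}-\d{2}-\d{2})")

    # Placeholder: assumes prompts look like "user@servername:~$ ...";
    # the real customer prompt format differs but is equally regular.
    PROMPT = re.compile(r"^\w+@(?P<server>[\w.-]+):")

    def file_dates(path):
        """Start date from the file name, end date from the file's mtime."""
        m = FILENAME_DATE.search(os.path.basename(path))
        start = datetime.strptime(m.group(1), "%Y-%m-%d").date() if m else None
        end = date.fromtimestamp(os.path.getmtime(path))
        return start, end

    def server_name(line):
        """Pull the server name out of a shell-prompt line, if there is one."""
        m = PROMPT.match(line)
        return m.group("server") if m else None

    def line_hash(line):
        """Hash used to collapse duplicate lines; collisions are acceptable."""
        return hashlib.md5(line.encode("utf-8", "replace")).hexdigest()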
With exactly zero code written and no database file created yet, I'm planning a log-entry table with an auto-numbered PK field, a text field, and a hash of the text field (collisions may occur when [ new.text <> old.text and new.hash = old.hash ], but I'm not interested in 100% perfect results). I'll also have a table containing just the list of server names picked up while reading the log files, with a PK field of its own, and then one table with an auto-numbered PK, a reference to the server PK, a reference to the log-entry PK, and the date taken from the filename as well. (A rough sketch of that schema is at the bottom of this message.)

On Thu, Feb 2, 2017 at 11:59 AM, Simon Slavin <slav...@bigfraud.org> wrote:

> Under those circumstances, all you're really doing by putting this data in
> a SQLite database is consolidating lots of separate files into one. So
> import everything from those files, without checking for duplicates, using
> as your primary key the combination of logfile name and line number. To
> avoid having to deal with errors from duplicates use
>
>     INSERT OR IGNORE ...
>
> Once you've worked out how to get it into a SQLite database you can decide
> whether to do searches using LIKE or FTS. Or duplicate your database and
> experiment with both approaches to find your ideal balance of filesize and
> search speed.
>
> Simon.
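Here's the rough sketch of that schema I mentioned above, written against Python's sqlite3 module. The table and column names are just placeholders, and the UNIQUE constraints are where Simon's INSERT OR IGNORE suggestion would do the de-duplication:

    import sqlite3

    conn = sqlite3.connect("logs.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS log_entry (
        id   INTEGER PRIMARY KEY,   -- auto-numbered by SQLite
        text TEXT NOT NULL,
        hash TEXT NOT NULL UNIQUE   -- rare collisions tolerated, per above
    );
    CREATE TABLE IF NOT EXISTS server (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE IF NOT EXISTS occurrence (
        id        INTEGER PRIMARY KEY,
        server_id INTEGER NOT NULL REFERENCES server(id),
        entry_id  INTEGER NOT NULL REFERENCES log_entry(id),
        log_date  TEXT              -- date lifted from the filename
    );
    """)

    # INSERT OR IGNORE maps straight onto the UNIQUE constraints:
    # re-inserting a duplicate hash or server name is a silent no-op.
    conn.execute("INSERT OR IGNORE INTO server (name) VALUES (?)",
                 ("example-server",))
    conn.commit()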