Wow, thank you Frank, it was very kind of you to type that all out!
Daniel Wolf

From: [email protected] [mailto:[email protected]] On Behalf Of Frank Ress
Sent: Tuesday, June 28, 2016 2:03 PM
To: [email protected]
Subject: RE: [NTSysADM] Compare two large lists

I’m a SQL guy (SQL Server of late). You can download and install the Developer Edition; it’s free. You really don’t need any of the add-on services (Analysis Services, Reporting Services, Integration Services…). Just install the database itself and the management tools. Bing/Microsoft will help answer any questions you might have regarding installation.

Once it’s running, open Management Studio. Connect to your new instance and expand the navigator pane to see the databases. You’ll have 4 system databases by default: master, model, msdb, and tempdb. Create a new database (right-click the ‘Databases’ node, New Database, etc.; defaults for file names and locations will be fine). Name it whatever you’d like.

Once the database is created, right-click that database in the navigator pane and pick ‘Tasks’/Import Data… Assuming you have the hashes in a spreadsheet or whatever, just import each list into its own table using the import wizard. Name the tables whatever you’d like, e.g. BigList and SmallList. You’ll also give the column holding the data a name in each table (let’s assume you have no other columns of info for each table, just the hashes). You can give them the same name, but it’s easier if they’re unique. Call them BigListHash and SmallListHash, for example.

Once the tables are created and populated, right-click your database in the navigator again and select ‘New Query’. A new editing pane will open to the right of the navigator. In the query pane, enter:

    SELECT DISTINCT SmallListHash
    FROM SmallList
    INNER JOIN BigList ON SmallListHash = BigListHash

You don’t need the ‘DISTINCT’ keyword if there are no duplicate hash values in your lists. The query would perform better without it, but using it will eliminate any dups that exist in the data.
Other than speed, can’t hurt to have it. There are buttons on the toolbar that will let you export the results to a text file, if you’d like.

HTH

Frank Ress
Gas Technology Institute

From: [email protected] [mailto:[email protected]] On Behalf Of Richard Stovall
Sent: Tuesday, June 28, 2016 1:03 PM
To: [email protected]
Subject: [NTSysADM] Compare two large lists

Not necessarily Windows-related. I need to compare a list of about 300,000 file hashes against a larger list of ~30,000,000 and find ones that are represented in both data sets. I'm not a database guy, nor have I ever played one on TeeVee. Any ideas about how to go about this with standard/free tools in Windows or Linux?

TIA,
RS
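For anyone who wants to see how Frank's join behaves before installing anything, here is a minimal sketch that runs the same query against toy data. It swaps in SQLite (through Python's built-in sqlite3 module) for SQL Server, which is an assumption made purely for convenience and not something suggested in the thread. The helper name common_hashes and the sample values are hypothetical; the table and column names follow Frank's example.

```python
import sqlite3


def common_hashes(small, big, distinct=True):
    """Return the hashes found in both lists, using the join from the thread.

    Hypothetical helper: SQLite stands in for SQL Server here.
    """
    conn = sqlite3.connect(":memory:")
    # One single-column table per list, named as in Frank's example.
    conn.execute("CREATE TABLE SmallList (SmallListHash TEXT)")
    conn.execute("CREATE TABLE BigList (BigListHash TEXT)")
    conn.executemany("INSERT INTO SmallList VALUES (?)", [(h,) for h in small])
    conn.executemany("INSERT INTO BigList VALUES (?)", [(h,) for h in big])

    # DISTINCT is optional; without it, duplicates in either list
    # multiply the matching rows.
    keyword = "DISTINCT " if distinct else ""
    rows = conn.execute(
        f"SELECT {keyword}SmallListHash FROM SmallList "
        "INNER JOIN BigList ON SmallListHash = BigListHash"
    ).fetchall()
    conn.close()
    return sorted(row[0] for row in rows)


# Made-up sample data with a duplicate in each list.
small = ["aaa", "bbb", "bbb", "ccc"]
big = ["bbb", "ccc", "ddd", "ccc"]

print(common_hashes(small, big))                  # ['bbb', 'ccc']
print(common_hashes(small, big, distinct=False))  # ['bbb', 'bbb', 'ccc', 'ccc']
```

The second call shows why DISTINCT is worth keeping when the lists may contain dups: each duplicate on either side produces an extra matching row.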

