> > > You might reevaluate if you're using the right tool for the job.
> > That's my question: IS sqlite the right tool here? =)
> Then I guess the right question is what are your goals? To make
> maintenance easier?
> Why were the thousands of files a problem?

Short answer: I want to improve

- data handling
- data consistency and correctness
- data extraction

Each file contains detailed genotypic information for many individuals at a given genomic region. We have to implement _loads_ of quality control measures to ensure the maximum possible data correctness. Earlier, we did this manually. We can't do this any longer.

Handling that many files is a nightmare, especially if a couple of different individuals are involved. For example, it is horrible to extract subsets from that mess (a few markers from a couple of individuals is a major problem with many, many files, but easily solved in a relational db).
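
To make the subset extraction concrete, here is a rough sketch of what I have in mind, using Python's built-in sqlite3 module. The table layout, the column names and the helper `extract_subset` are made up for illustration and are not our real schema; the sample rows are dummy values.

```python
import sqlite3

# Hypothetical schema (an assumption, not the real layout):
# one row per genotype call of one individual at one marker.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genotypes (
    individual TEXT NOT NULL,
    marker     TEXT NOT NULL,
    genotype   TEXT NOT NULL,
    PRIMARY KEY (individual, marker)  -- refuses duplicate calls
);
""")

# Dummy data, for illustration only.
rows = [
    ("ind1", "m1", "AA"), ("ind1", "m2", "AG"), ("ind1", "m3", "GG"),
    ("ind2", "m1", "AG"), ("ind2", "m2", "GG"), ("ind2", "m3", "AG"),
    ("ind3", "m1", "AA"), ("ind3", "m2", "AA"), ("ind3", "m3", "GG"),
]
conn.executemany("INSERT INTO genotypes VALUES (?, ?, ?)", rows)


def extract_subset(conn, individuals, markers):
    """Return the calls for the given individuals at the given markers."""
    # Build one "?" placeholder per value so the values are bound, not pasted in.
    ind_ph = ",".join("?" for _ in individuals)
    mk_ph = ",".join("?" for _ in markers)
    sql = (
        "SELECT individual, marker, genotype FROM genotypes "
        f"WHERE individual IN ({ind_ph}) AND marker IN ({mk_ph}) "
        "ORDER BY individual, marker"
    )
    return conn.execute(sql, [*individuals, *markers]).fetchall()


# A few markers from a couple of individuals, in one query.
subset = extract_subset(conn, ["ind1", "ind3"], ["m1", "m3"])
for row in subset:
    print(row)
```

The primary key is there only to hint at the consistency side: with it, the database itself rejects a second call for the same individual and marker, which is the kind of check we used to do by hand.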