Hi, I just spent the afternoon getting to know the array design and raw data import into BASE2 - starting with genepix format - and have come across a few things.
I'm using BASE 2.2.2 (build #3172; schema #30). I have looked through the fixes for 2.2.3 and decided not to upgrade - otherwise I'll just spend my whole life upgrading BASE... ;-)

1. In the "view files" page, the "type" menu has a blank entry for 'raw data' although this still seems to work. This might be fixed in 2.2.3 see http://base.thep.lu.se/ticket/559 which looks related.

2. I think there's some inconsistent handling of trailing spaces in the "reporter ID" column of a genepix .gpr file. For example I can import reporters, and create an array design from the file pasted below, but I can't then import the raw data!

(the following is just 8 lines long - if the long lines get mangled, I'll send a copy by mail on request)

ATF 1.0
27 43
Type=GenePix Results 1.4
"Block" "Column" "Row" "Name" "ID" "X" "Y" "Dia." "F635 Median" "F635 Mean" "F635 SD" "B635 Median" "B635 Mean" "B635 SD" "% > B635+1SD" "% > B635+2SD" "F635 % Sat." "F532 Median" "F532 Mean" "F532 SD" "B532 Median" "B532 Mean" "B532 SD" "% > B532+1SD" "% > B532+2SD" "F532 % Sat." "Ratio of Medians" "Ratio of Means" "Median of Ratios" "Mean of Ratios" "Ratios SD" "Rgn Ratio" "Rgn R²" "F Pixels" "B Pixels" "Sum of Medians" "Sum of Means" "Log Ratio" "F635 Median - B635" "F532 Median - B532" "F635 Mean - B635" "F532 Mean - B532" "Flags"
1 1 1 "demoA" "demorep1" 1690 5730 110 183 181 42 59 62 25 100 98 0 276 270 48 64 65 13 100 100 0 0.585 0.592 0.570 0.576 1.357 0.591 0.782 80 621 336 328 -0.774 124 212 122 206 0
1 2 1 "demoB" "demorep2 " 1910 5730 120 114 137 175 57 61 37 71 21 0 346 341 80 63 65 35 96 95 0 0.201 0.288 0.192 0.209 2.379 0.398 0.094 120 716 340 358 -2.312 57 283 80 278 0
1 3 1 "demoC" "demorep3" 2110 5740 110 145 148 43 63 68 30 92 68 0 208 214 48 69 74 43 98 93 0 0.590 0.586 0.599 0.541 1.987 0.504 0.582 80 566 221 230 -0.761 82 139 85 145 0
1 4 1 "demoD" "demorep4" 2300 5730 110 185 187 51 59 63 23 100 96 0 298 294 57 64 67 24 100 98 0 0.538 0.557 0.526 0.538 1.599 0.549 0.730 80 590 360 358 -0.893 126 234 128 230 0

the stacktrace from the raw data import is:

net.sf.basedb.core.BaseException: Item not found: Reporter mismatch: The feature has reporter 'demorep2' whereas you have given 'demorep2 ' on line 6: 1 2 1 "demoB" "de...
    at net.sf.basedb.plugins.AbstractFlatFileImporter.doImport(AbstractFlatFileImporter.java:592)
    at net.sf.basedb.plugins.AbstractFlatFileImporter.run(AbstractFlatFileImporter.java:442)
    at net.sf.basedb.core.PluginExecutionRequest.invoke(PluginExecutionRequest.java:88)
    at net.sf.basedb.core.InternalJobQueue$JobRunner.run(InternalJobQueue.java:420)
    at java.lang.Thread.run(Thread.java:619)
Caused by: net.sf.basedb.core.ItemNotFoundException: Item not found: Reporter mismatch: The feature has reporter 'demorep2' whereas you have given 'demorep2 '
    at net.sf.basedb.core.RawDataBatcher.doInsert(RawDataBatcher.java:390)
    at net.sf.basedb.core.RawDataBatcher.insert(RawDataBatcher.java:343)
    at net.sf.basedb.plugins.RawDataFlatFileImporter.handleData(RawDataFlatFileImporter.java:544)
    at net.sf.basedb.plugins.AbstractFlatFileImporter.doImport(AbstractFlatFileImporter.java:570)
    ... 4 more

I think BASE1 was more tolerant.

3. Case sensitivity in the reporter ID (external id) column. I get "Error: Duplicate entry 'demoBLANK' for key 2" if I import reporters from this file:

ATF 1.0
27 43
Type=GenePix Results 1.4
"Block" "Column" "Row" "Name" "ID" "X" "Y" "Dia." "F635 Median" "F635 Mean" "F635 SD" "B635 Median" "B635 Mean" "B635 SD" "% > B635+1SD" "% > B635+2SD" "F635 % Sat." "F532 Median" "F532 Mean" "F532 SD" "B532 Median" "B532 Mean" "B532 SD" "% > B532+1SD" "% > B532+2SD" "F532 % Sat." "Ratio of Medians" "Ratio of Means" "Median of Ratios" "Mean of Ratios" "Ratios SD" "Rgn Ratio" "Rgn R²" "F Pixels" "B Pixels" "Sum of Medians" "Sum of Means" "Log Ratio" "F635 Median - B635" "F532 Median - B532" "F635 Mean - B635" "F532 Mean - B532" "Flags"
1 1 1 "demoA" "demorep1" 1690 5730 110 183 181 42 59 62 25 100 98 0 276 270 48 64 65 13 100 100 0 0.585 0.592 0.570 0.576 1.357 0.591 0.782 80 621 336 328 -0.774 124 212 122 206 0
1 2 1 "demoB" "demorep2" 1910 5730 120 114 137 175 57 61 37 71 21 0 346 341 80 63 65 35 96 95 0 0.201 0.288 0.192 0.209 2.379 0.398 0.094 120 716 340 358 -2.312 57 283 80 278 0
1 3 1 "demoblank" "demoblank" 2110 5740 110 145 148 43 63 68 30 92 68 0 208 214 48 69 74 43 98 93 0 0.590 0.586 0.599 0.541 1.987 0.504 0.582 80 566 221 230 -0.761 82 139 85 145 0
1 4 1 "demoBLANK" "demoBLANK" 2300 5730 110 185 187 51 59 63 23 100 96 0 298 294 57 64 67 24 100 98 0 0.538 0.557 0.526 0.538 1.599 0.549 0.730 80 590 360 358 -0.893 126 234 128 230 0

However, in BASE1 it was possible to import files with problems like this. For example, see http://base.vectorbase.org/raw_edit.phtml?i_r=102 (just click ok to log in). If you compare the imported .gpr file (scroll down to 2 17 20) with the table of data (position 857), you see that "BLANK" was imported as "Blank" because "Blank" was already in the table.

Tomorrow I'll see how far I get with fixing the input files. Ideally I want to be able to continue to import raw data into BASE2 linked to array designs that were migrated from BASE1. We have to fix all kinds of stuff in the files anyway, so I don't think that should be too much of a problem.

4. It doesn't seem possible to "un-import" raw data (in order to reimport it after fixing some annoying typo). The same seems to be true of array designs (can't reimport reporter maps).

5. There doesn't seem to be a record of which file was used to import features into an array design (this has been discussed on the list recently, I think).

cheers,
Bob.
-- 
Bob MacCallum | VectorBase Developer | Kafatos/Christophides Groups |
Division of Cell and Molecular Biology | Imperial College London |
Phone +442075941945 | Email [EMAIL PROTECTED]

_______________________________________________
The BASE general discussion mailing list
basedb-users@lists.sourceforge.net