Hi Sunanda,

Your sort worked perfectly. Thanks also for the explanation. You might be interested to know that on my 450 Pentium 2 running w2k total time to sort 29,688 lines using your code was:
3:50:10

I want to thank Sunanda and everyone else that helped with this. I still have questions, but I'm committed to other things for the rest of this week. Next week I'll try to ask them.

One question I'll ask now, however: What exactly does hash do, and could hash be used to speed up sort?

Thanks again,
Louis

At 05:50 PM 9/3/2002 -0400, you wrote:
>Louis:
>
> > How do I sort the following lines by
> > the numbers only (not by the letters and not by the length).
>
>Hope this helps.....
>
>Let me first simplify the data to make the code snippets easier to follow:
>
>data: [
>    "1 first of the ones"
>    "1 last of the ones"
>    "3 top three"
>    "3 middle three"
>    "3 bottom three"
>    "2 start of the twos"
>    "2 middle of the twos"
>    "2 last of the twos"
>    ]
>
>You want a *stable* sort on the first field. End result (on my data) should
>be:
>
>sorted-data: [
>    "1 first of the ones"
>    "1 last of the ones"
>    "2 start of the twos"
>    "2 middle of the twos"
>    "2 last of the twos"
>    "3 top three"
>    "3 middle three"
>    "3 bottom three"
>    ]
>
>
>A straight sort will compare the whole line. So
>
>    sort sorted-data: copy data
>
>doesn't preserve the input sequencing for fields with an equal key.
>
>If Rebol's sort were "stable" -- i.e. it kept equal keys in their
>original sequence, then all you'd need to do is to write your own
>sort compare routine to compare partial keys.
>
>This is the basic code:
>
>    sort/compare sorted-data: copy data func [a b] [return a < b]
>
>a and b (you can use any names) are the pairs of records to be compared.
>
>But the code above adds no value to the basic sort, as we want to
>sort on the first character of each record (on my data). So:
>
>    sort/compare sorted-data: copy data func [a b] [
>        return (copy/part a 1) < (copy/part b 1)]
>
>
>It's not *quite* what *you* want, as you need to sort on the
>first space-delimited field.
>One way is to use 'parse:
>
>    sort/compare sorted-data: copy data func [a b] [
>        return (first parse a " ") < (first parse b " ")
>    ]
>
>But this doesn't preserve the input sequence -- it looks to me like
>Rebol's sort is ***not*** stable. So we need to add the input sequence
>to each key as part of the compare. Like this:
>
>    sort/compare sorted-data: copy data func [a b /local a-key b-key ] [
>        a-key: join first parse/all a " " ["-" index? find data a]
>        b-key: join first parse/all b " " ["-" index? find data b]
>        ;;; print [a-key " " b-key] ;;-- uncomment this to see what's happening
>        return a-key < b-key
>    ]
>
>This is now a stable sort on my data.
>
>It's not quite what you want because your data starts with a space.
>So you need the **second** field in the parse:
>
>    sort/compare sorted-data: copy data func [a b /local a-key b-key ] [
>        a-key: join second parse/all a " " ["-" index? find data a]
>        b-key: join second parse/all b " " ["-" index? find data b]
>        ;;; print [a-key " " b-key] ;;-- uncomment this to see what's happening
>        return a-key < b-key
>    ]
>
>
>It should now work on your data, as long
>as all your integers remain three digits long.
>If you start to use unequal length
>integers, you'll need to normalise them in the sort key.
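
A note on that last point: one way the normalisation could look is to left-pad the number with zeros before building the key. This is only a sketch, assuming Rebol 2 (it relies on parse/all as used above). The names pad-key and make-key, the width of 6 and the sample lines are all made up for illustration; none of them come from Sunanda's post. The position in the block is padded as well, so that the string compare also puts position 2 before position 10.

```rebol
REBOL []

; pad-key is a hypothetical helper, not a Rebol built-in and not from
; the original post: left-pad a string with zeros up to the given width.
pad-key: func [key [string!] width [integer!]] [
    head insert/dup copy key "0" max 0 width - length? key
]

; Made-up sample lines with numbers of unequal length
; (leading space as in Louis's data, so the number is the second field).
data: [
    " 1120 en toij"
    " 96 ei de"
    " 120 thj ghj"
    " 96 proj ton"
]

; make-key (also hypothetical) builds: padded number, "-", padded position.
make-key: func [line [string!]] [
    rejoin [
        pad-key second parse/all line " " 6
        "-"
        pad-key form index? find data line 6
    ]
]

sorted-data: copy data
sort/compare sorted-data func [a b] [
    (make-key a) < (make-key b)
]

print mold sorted-data
```

The width of 6 is arbitrary; it only needs to be at least as wide as the longest number and the longest position. Like the code above, this looks each line up with find, so it assumes no two lines are exactly identical.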
>
>For your data, I get (I'm assuming you have it in a block as a
>string-per-line):
>
>print mold data
>[
>    " 454 en tw"
>    " 395 en th"
>    " 313 kai o"
>    " 175 oi de"
>    " 314 eij thn"
>    " 174 eij ton"
>    " 124 kai ouk"
>    " 123 kai thn"
>    " 219 ek tou"
>    " 160 kai en"
>    " 142 kai to"
>    " 126 tw qew"
>    " 166 kai h"
>    " 096 ei de"
>    " 094 ou mh"
>    " 091 ei mh"
>    " 120 thj ghj"
>    " 120 en toij"
>    " 112 estin o"
>    " 108 en taij"
>    " 118 o kurioj"
>    " 096 proj ton"
>    " 088 kai touj"
>    " 082 kai idou"
>    " 115 kai ta"
>    " 111 o uioj"
>    " 111 de kai"
>    " 103 ek twn"
>    " 114 eipen autoij"
>    " 104 tou anqrwpou"
>    " 071 legei autoij"
>    " 063 twn ioudaiwn"
>    " 105 proj auton"
>    " 092 oi maqhtai"
>    " 082 legei autw"
>    " 078 eipen autw"
>    " 103 ek thj"
>    " 090 ina mh"
>    " 086 autw o"
>    " 084 ou gar"
>    " 101 apo tou"
>]
>
>
> >> print mold sorted-data
>[
>    "063 twn ioudaiwn"
>    "071 legei autoij"
>    "078 eipen autw"
>    "082 kai idou"
>    "082 legei autw"
>    "084 ou gar"
>    "086 autw o"
>    "088 kai touj"
>    "090 ina mh"
>    "091 ei mh"
>    "092 oi maqhtai"
>    "094 ou mh"
>    "096 ei de"
>    "096 proj ton"
>    "101 apo tou"
>    "103 ek twn"
>    "103 ek thj"
>    "104 tou anqrwpou"
>    "105 proj auton"
>    "108 en taij"
>    "111 o uioj"
>    "111 de kai"
>    "112 estin o"
>    "114 eipen autoij"
>    "115 kai ta"
>    "118 o kurioj"
>    "120 thj ghj"
>    "120 en toij"
>    "123 kai thn"
>    "124 kai ouk"
>    "126 tw qew"
>    "142 kai to"
>    "160 kai en"
>    "166 kai h"
>    "174 eij ton"
>    "175 oi de"
>    "219 ek tou"
>    "313 kai o"
>    "314 eij thn"
>    "395 en th"
>    "454 en tw"]
>
>Some of the lines it makes a difference on are the 111s and the 120s --
>they would be swapped with a simple sort, or randomised within themselves by
>a non-stable one.
>
>Hope that makes sense,
>
>Sunanda.
>--
>To unsubscribe from this list, please send an email to
>[EMAIL PROTECTED] with "unsubscribe" in the
>subject, without the quotes.
