Thanks for the explanation!
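
For anyone following the thread outside J, here is a rough Python sketch of the two points made in the quoted messages: discarding everything above the low 32 bits of a CRC value, and checking the actual substrings rather than trusting CRC matches alone. It rests on assumptions: that f in the quoted session is 128!:3 and computes a standard CRC-32 (the value shown for '123456789' is consistent with that, and zlib.crc32 is used here as a stand-in), and the names MASK32, to_unsigned32 and repeated_windows are made up for this illustration.

```python
import zlib
from collections import defaultdict

# 2**32 - 1: the same mask the quoted J session builds as fourbminus1.
MASK32 = 0xFFFFFFFF


def to_unsigned32(value):
    """Keep only the low 32 bits of an integer, discarding the upper bits."""
    return value & MASK32


def repeated_windows(s, width=10):
    """Return the distinct substrings of length `width` occurring more than once.

    Windows are grouped by CRC-32 first (assumed stand-in for 128!:3), then
    the substrings inside each group are compared directly, so two different
    windows that happen to share a CRC are not reported as repeats.
    """
    by_crc = defaultdict(list)
    for i in range(len(s) - width + 1):
        window = s[i:i + width]
        by_crc[zlib.crc32(window.encode("ascii"))].append(window)

    repeats = []
    for windows in by_crc.values():
        counts = {}
        for w in windows:
            counts[w] = counts.get(w, 0) + 1
        repeats.extend(w for w, n in counts.items() if n > 1)
    return repeats


if __name__ == "__main__":
    # The two negative values reported in the quoted J session.
    print(to_unsigned32(-873187034))    # 3421780262
    print(to_unsigned32(-2855392203))   # 1439575093
    print(repeated_windows("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
```

The second function keeps whole substrings in memory, so it does not save space the way the CRC-only J version does; it is only meant to show the verification step.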

On Tue, Jul 21, 2015 at 4:13 PM, Henry Rich <[email protected]> wrote:
>    <._2855392203 + 2^32
> 1439575093
>
> The upper bits of the CRC-32 should be discarded:
>
>    fourbminus1 =. (26 b.) 32 (33 b.) _32 (33 b.) _1
>    fourbminus1 (17 b.) f 'assiduously avoid any and all asinine alliterations'
> 1439575093
>
> Henry Rich
>
> On 7/21/2015 3:06 PM, Vijay Lulla wrote:
>>
>> Out of curiosity, I'm getting a different value for the example listed
>> under 128!:3. Shouldn't it be the same as listed on the page?
>>
>> Below is from my J session
>>
>>    f '123456789'
>> _873187034
>>    f 'assiduously avoid any and all asinine alliterations' NB. Different from the listed example
>> _2855392203
>>    JVERSION
>> Engine: j803/2014-10-19-11:11:11
>> Library: 8.04.06
>> Qt IDE: 1.4.3/5.4.2
>> Platform: Win 64
>> Installer: J804 install
>> InstallPath: h:/utilities/j64-804
>>
>>
>> On Tue, Jul 21, 2015 at 11:48 AM, Raul Miller <[email protected]> wrote:
>>>
>>> You can't have an inverse crc, because crc is a lossy transformation.
>>> You are basically relying on statistics to avoid collisions (different
>>> strings with the same crc).
>>>
>>> So actual use would look something like:
>>>
>>> step one: get the distinct crcs which are in use.
>>>
>>> step two: go over the data again and for each string find its crc, and
>>> check that some other relevant string isn't producing the same crc.
>>> (If there are, you'll need further work to untangle them.)
>>>
>>> --
>>> Raul
>>>
>>> On Tue, Jul 21, 2015 at 10:34 AM, Mike Day <[email protected]> wrote:
>>>>
>>>> That's neat, but it's a bit messy retrieving the actual
>>>> substrings rather than their encoded forms.
>>>>
>>>> This does it,
>>>>    10(]{~i.@[+/~((I.@:(1<#/.~))@:( (128!:3)\ ]))) s
>>>> AAAAACCCCC
>>>> CCCCCAAAAA
>>>>
>>>> but it would be much better with an inverse CRC;
>>>> however that doesn't seem to be supported in J.
>>>>
>>>> Is there a maximum window size for this approach?
>>>>
>>>> Thanks,
>>>>
>>>> Mike
>>>>
>>>>
>>>> On 21/07/2015 14:37, Henry Rich wrote:
>>>>>
>>>>> For longer subsequences consider using
>>>>>
>>>>>    (10 (128!:3)\ ])
>>>>>
>>>>> to reduce the size of the intermediate array.
>>>>>
>>>>> Henry Rich
>>>>>
>>>>> On 7/21/2015 12:49 AM, Vijay Lulla wrote:
>>>>>>
>>>>>> Using slightly less space
>>>>>>
>>>>>>    (~. #~ 1 < #/.~)@(10 ]\ ]) s
>>>>>>
>>>>>> On Mon, Jul 20, 2015 at 11:59 PM, Tikkanz <[email protected]> wrote:
>>>>>>>
>>>>>>> (i.~ ~: i:~) will find duplicates so how about:
>>>>>>>
>>>>>>>    ~.@(#~ i.~ ~: i:~)@(10 ]\ ]) s
>>>>>>> AAAAACCCCC
>>>>>>> CCCCCAAAAA
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 21, 2015 at 3:51 PM, Jon Hough <[email protected]> wrote:
>>>>>>>
>>>>>>>> This is a problem from leetcode.com (similar to Project Euler)
>>>>>>>> https://leetcode.com/problems/repeated-dna-sequences/
>>>>>>>> The problem is to find all 10 letter repeated subsequences from a DNA
>>>>>>>> string (made of C,G,A,T characters).
>>>>>>>> My solution:
>>>>>>>> func =: (I.@:(1&<)@:>@:(1&{)@:(~. ,: <"0@:(#/.~)) { ])@:(<"1@:(10&(]\)))
>>>>>>>> e.g. s =: 'AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT' NB. see the link for this definition
>>>>>>>> func s
>>>>>>>> ┌──────────┬──────────┐
>>>>>>>> │AAAAACCCCC│CCCCCAAAAA│
>>>>>>>> └──────────┴──────────┘
>>>>>>>>
>>>>>>>> It is not very pretty. Can anyone improve on it?
>>>>
>>>> ----------------------------------------------------------------------
>>>> For information about J forums see http://www.jsoftware.com/forums.htm
