Re: efficiently splitting up strings based on substrings

2009-09-06 Thread 7stud

On Sep 5, 5:29 pm, per perfr...@gmail.com wrote:
 it's exactly the same problem, except there are no constraints on the
 strings.  so the problem is, like you say, matching the substrings
 against the string x. in other words, finding out where x aligns to
 the ordered substrings abc, and then determining what chunk of x belongs
 to a, what chunk belongs to b, and what chunk belongs to c.

 so in the example i gave above, the substrings are: a = "1030405", b =
 "1babcf", c = "fUUIUP", so abc = "10304051babcffUUIUP"

 given a substring like "4051ba", i'd want to split it into the chunks a,
 b, and c. in this case, i'd want the result to be: ["405", "1ba"] --
 i.e. "405" is the chunk of x that belongs to a, and "1ba" the chunk
 that belongs to b. in this case, there are no chunks of c.  if x
 instead were "4051babcffUU", the right output is: ["405", "1babcf",
 "fUU"], which are the corresponding chunks of a, b, and c that make up
 x respectively.

 i'm not sure how to approach this. any ideas/tips would be greatly
 appreciated. thanks again.


a = "1030405"
b = "1babcf"
c = "fUUIUP"
abc = "10304051babcffUUIUP"
data = "4051babcffU"

# Offsets of b and c, measured from where data starts inside abc.
data_start = abc.find(data)
b_start = abc.find(b) - data_start
c_start = abc.find(c) - data_start

print(data[:b_start])
print(data[b_start:c_start])
print(data[c_start:])

--output:--
405
1babcf
fU

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: efficiently splitting up strings based on substrings

2009-09-06 Thread 7stud

On Sep 6, 1:14 am, 7stud bbxx789_0...@yahoo.com wrote:
 a = "1030405"
 b = "1babcf"
 c = "fUUIUP"
 abc = "10304051babcffUUIUP"
 data = "4051babcffU"

 data_start = abc.find(data)
 b_start = abc.find(b) - data_start
 c_start = abc.find(c) - data_start

 print(data[:b_start])
 print(data[b_start:c_start])
 print(data[c_start:])

 --output:--
 405
 1babcf
 fU

...or maybe this is easier to follow:

a = "1030405"
b = "1babcf"
c = "fUUIUP"
abc = "10304051babcffUUIUP"
data = "4051babcffU"

# Drop everything before data so that offsets are relative to data itself.
data_start = abc.find(data)
new_abc = abc[data_start:]
print(new_abc)
print(data)
print("-" * 10)

--output:--
4051babcffUUIUP
4051babcffU
----------

b_start = new_abc.find(b)
c_start = new_abc.find(c)

print(data[:b_start])
print(data[b_start:c_start])
print(data[c_start:])

--output:--
405
1babcf
fU


Re: efficiently splitting up strings based on substrings

2009-09-06 Thread 7stud

On Sep 6, 1:23 am, 7stud bbxx789_0...@yahoo.com wrote:
 ...or maybe this is easier to follow:

 a = "1030405"
 b = "1babcf"
 c = "fUUIUP"
 abc = "10304051babcffUUIUP"
 data = "4051babcffU"

 data_start = abc.find(data)
 new_abc = abc[data_start:]
 print(new_abc)
 print(data)
 print("-" * 10)

 --output:--
 4051babcffUUIUP
 4051babcffU
 ----------

 b_start = new_abc.find(b)
 c_start = new_abc.find(c)

 print(data[:b_start])
 print(data[b_start:c_start])
 print(data[c_start:])

 --output:--
 405
 1babcf
 fU

Nope.  My solutions have problems with:

data = "cffU"

To handle that
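
The message breaks off here. As a hedged sketch only (not necessarily what the author went on to post), the snippet below first reproduces the failure in the version that searches new_abc, then shows one possible repair: take the boundaries from the lengths of a and b instead of from find(), and clamp them into the range of data. The helper name clamp and the lists broken and fixed are made up for this illustration.

```python
a = "1030405"
b = "1babcf"
c = "fUUIUP"
abc = a + b + c
data = "cffU"

data_start = abc.find(data)        # 11: data begins inside b
new_abc = abc[data_start:]         # "cffUUIUP"

b_start = new_abc.find(b)          # -1, because b is no longer fully present
c_start = new_abc.find(c)          # 2

# -1 as a slice index means "one before the end", so the slices go wrong.
broken = [data[:b_start], data[b_start:c_start], data[c_start:]]
print(broken)                      # ['cff', '', 'fU']


def clamp(i):
    """Force an offset into the valid range 0..len(data)."""
    return max(0, min(i, len(data)))


# Boundaries come from the component lengths, shifted by data_start,
# so they do not depend on find() locating b or c.
b_start = clamp(len(a) - data_start)            # clamp(7 - 11) == 0
c_start = clamp(len(a) + len(b) - data_start)   # clamp(13 - 11) == 2
pieces = (data[:b_start], data[b_start:c_start], data[c_start:])
fixed = [s for s in pieces if s]   # drop empty chunks
print(fixed)                       # ['cf', 'fU']
```

The same clamped offsets also reproduce the earlier result for data = "4051babcffU", since clamping leaves in-range offsets unchanged.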

Re: efficiently splitting up strings based on substrings

2009-09-05 Thread Rhodri James


On Sat, 05 Sep 2009 22:54:41 +0100, per perfr...@gmail.com wrote:


I'm trying to efficiently split strings based on what substrings
they are made up of.
i have a set of strings that are comprised of known substrings.
For example, a, b, and c are substrings that are not identical to each
other, e.g.:
a = "0" * 5
b = "1" * 5
c = "2" * 5

Then my_string might be:

my_string = a + b + c

i am looking for an efficient way to solve the following problem.
suppose i have a short string x that is a substring of my_string.  I want
to split the string x into blocks based on what substrings (i.e. a, b,
or c) chunks of x fall into.

to illustrate this, suppose x = "00111". Then I can detect where x
starts in my_string using my_string.find(x).  But I don't know how to
partition x into blocks depending on the substrings.  What I want to get
out in this case is: "00", "111".  If x were "00122", I'd want to get
out "00", "1", "22".

is there an easy way to do this?  i can't simply split x on a, b, or c
because these might not be contained in x.  I want to avoid doing
something inefficient like looking at all substrings of my_string etc.

i wouldn't mind using regular expressions for this but i cannot think
of an easy regular expression for this problem.  I looked at the string
module in the library but did not see anything that seemed related, but
i might have missed it.


I'm not sure I understand your question exactly.  You seem to imply
that the order of the substrings of x is consistent.  If that's the
case, this ought to help:


>>> import re
>>> x = "00122"
>>> m = re.match(r"(0*)(1*)(2*)", x)
>>> m.groups()
('00', '1', '22')
>>> y = "00111"
>>> m = re.match(r"(0*)(1*)(2*)", y)
>>> m.groups()
('00', '111', '')

You'll have to filter out the empty groups for yourself, but that's
no great problem.
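
Filtering out the empty groups could look something like the sketch below. The function name nonempty_groups is invented for this example; the pattern is the one from the session above.

```python
import re


def nonempty_groups(pattern, text):
    """Return the non-empty captured groups of a match, or [] if no match."""
    m = re.match(pattern, text)
    if m is None:
        return []
    # A group that matched nothing is '' (or None if it did not take part).
    return [g for g in m.groups() if g]


print(nonempty_groups(r"(0*)(1*)(2*)", "00111"))   # ['00', '111']
print(nonempty_groups(r"(0*)(1*)(2*)", "00122"))   # ['00', '1', '22']
```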

--
Rhodri James *-* Wildebeest Herder to the Masses

Re: efficiently splitting up strings based on substrings

2009-09-05 Thread per

On Sep 5, 6:42 pm, Rhodri James rho...@wildebst.demon.co.uk wrote:

 I'm not sure I understand your question exactly.  You seem to imply
 that the order of the substrings of x is consistent.  If that's the
 case, this ought to help:

 >>> import re
 >>> x = "00122"
 >>> m = re.match(r"(0*)(1*)(2*)", x)
 >>> m.groups()
 ('00', '1', '22')
 >>> y = "00111"
 >>> m = re.match(r"(0*)(1*)(2*)", y)
 >>> m.groups()
 ('00', '111', '')

 You'll have to filter out the empty groups for yourself, but that's
 no great problem.

 --
 Rhodri James *-* Wildebeest Herder to the Masses

The order of the substrings is consistent but what if it's not 0, 1, 2
but a more complicated string? e.g.

a = "1030405", b = "1babcf", c = "fUUIUP"

then the substring x might be "4051ba", in which case using a regexp
with (1*) will not work since both a and b substrings begin with the
character "1".

your solution would work if that weren't a possibility, so what you wrote
is definitely the kind of solution i am looking for. i am just not
sure how to solve it in the general case where the substrings might be
similar to each other (but not similar enough that you can't tell
where the substring came from).


Re: efficiently splitting up strings based on substrings

2009-09-05 Thread Rhodri James


On Sat, 05 Sep 2009 23:54:08 +0100, per perfr...@gmail.com wrote:


The order of the substrings is consistent but what if it's not 0, 1, 2
but a more complicated string? e.g.

a = "1030405", b = "1babcf", c = "fUUIUP"

then the substring x might be "4051ba", in which case using a regexp
with (1*) will not work since both a and b substrings begin with the
character "1".


Right.  This looks approximately nothing like what I thought your
problem was.  Would I be right in thinking that you want to match
substrings of your potential substrings against the string x?

I'm sufficiently confused that I think I'd like to see what your
use case actually is before I make more of a fool of myself.

--
Rhodri James *-* Wildebeest Herder to the Masses

Re: efficiently splitting up strings based on substrings

2009-09-05 Thread per

On Sep 5, 7:07 pm, Rhodri James rho...@wildebst.demon.co.uk wrote:
 Right.  This looks approximately nothing like what I thought your
 problem was.  Would I be right in thinking that you want to match
 substrings of your potential substrings against the string x?

 I'm sufficiently confused that I think I'd like to see what your
 use case actually is before I make more of a fool of myself.

 --
 Rhodri James *-* Wildebeest Herder to the Masses

it's exactly the same problem, except there are no constraints on the
strings.  so the problem is, like you say, matching the substrings
against the string x. in other words, finding out where x aligns to
the ordered substrings abc, and then determining what chunk of x belongs
to a, what chunk belongs to b, and what chunk belongs to c.

so in the example i gave above, the substrings are: a = "1030405", b =
"1babcf", c = "fUUIUP", so abc = "10304051babcffUUIUP"

given a substring like "4051ba", i'd want to split it into the chunks a,
b, and c. in this case, i'd want the result to be: ["405", "1ba"] --
i.e. "405" is the chunk of x that belongs to a, and "1ba" the chunk
that belongs to b. in this case, there are no chunks of c.  if x
instead were "4051babcffUU", the right output is: ["405", "1babcf",
"fUU"], which are the corresponding chunks of a, b, and c that make up
x respectively.

i'm not sure how to approach this. any ideas/tips would be greatly
appreciated. thanks again.

Re: efficiently splitting up strings based on substrings

2009-09-05 Thread Rhodri James


On Sun, 06 Sep 2009 00:29:14 +0100, per perfr...@gmail.com wrote:


it's exactly the same problem, except there are no constraints on the
strings.  so the problem is, like you say, matching the substrings
against the string x. in other words, finding out where x aligns to
the ordered substrings abc, and then determining what chunk of x belongs
to a, what chunk belongs to b, and what chunk belongs to c.

so in the example i gave above, the substrings are: a = "1030405", b =
"1babcf", c = "fUUIUP", so abc = "10304051babcffUUIUP"

given a substring like "4051ba", i'd want to split it into the chunks a,
b, and c. in this case, i'd want the result to be: ["405", "1ba"] --
i.e. "405" is the chunk of x that belongs to a, and "1ba" the chunk
that belongs to b. in this case, there are no chunks of c.  if x
instead were "4051babcffUU", the right output is: ["405", "1babcf",
"fUU"], which are the corresponding chunks of a, b, and c that make up
x respectively.

i'm not sure how to approach this. any ideas/tips would be greatly
appreciated. thanks again.


I see, I think.  Let me explain it back to you, just to be sure.

You have a string x, and three component strings a, b and c.  x is
a substring of the concatenation of a, b and c (i.e. a+b+c).  You
want to find out how x overlaps a, b and c.

Assuming I've understood this right, you're overthinking the problem.
All you need to do is find the start of x in a+b+c, then do some
calculations based on the string lengths and slice appropriately.
I'd scribble some example code, but it's nearly 1am and I'd be sure
to commit fence-post errors at this time of night.
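
For what it's worth, the "find the start, then slice by lengths" idea might be sketched as follows. This is only an illustration under a few assumptions: the function name split_by_components is made up, x is taken to align at its first occurrence in the concatenation, and empty chunks are dropped.

```python
def split_by_components(x, components):
    """Split x, a substring of ''.join(components), at component boundaries."""
    whole = "".join(components)
    start = whole.find(x)              # first occurrence only
    if start < 0:
        raise ValueError("x is not a substring of the concatenation")
    end = start + len(x)

    chunks = []
    offset = 0                         # where the current component starts
    for comp in components:
        # Overlap of [start, end) with this component's span in whole.
        lo = max(start, offset)
        hi = min(end, offset + len(comp))
        if lo < hi:
            chunks.append(x[lo - start:hi - start])
        offset += len(comp)
    return chunks


parts = ["1030405", "1babcf", "fUUIUP"]
print(split_by_components("4051ba", parts))        # ['405', '1ba']
print(split_by_components("4051babcffUU", parts))  # ['405', '1babcf', 'fUU']
print(split_by_components("cffU", parts))          # ['cf', 'fU']
```

Because the boundaries come from the lengths rather than from searching for each component, it does not matter whether components look alike or whether x contains a complete copy of any of them. If x can occur at more than one position in the concatenation, the first match is used, which may or may not be the alignment wanted.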

--
Rhodri James *-* Wildebeest Herder to the Masses
