Re: [Numpy-discussion] How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

2021-03-14 Thread zoj613
The following seems to produce what you want with the data provided:

```
In [31]: dF = np.genfromtxt('/home/F.csv', delimiter=',').tolist()

In [32]: dS = np.genfromtxt('/home/S.csv', delimiter=',').tolist()

In [33]: r = [i in dS for i in dF]

In [34]: sum(r)

Out[34]: 300
```

I hope this helps.



___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

2021-03-14 Thread Andras Deak
On Sun, Mar 14, 2021 at 8:35 PM Robert Kern wrote:
>
> On Sun, Mar 14, 2021 at 3:06 PM Ali Sheikholeslam wrote:
>>
>> I have written a question in:
>> https://stackoverflow.com/questions/66623145/how-to-get-boolean-matrix-for-similar-lists-in-two-different-size-numpy-arrays-o
>> It was recommended by numpy that I send this question to the mailing list.
>>
>> The question is as follows. I would appreciate it if you could advise me on
>> how to solve the problem:
>>
>> First, here is a small example of two lists:
>>
>> F = [[1,2,3],[3,2,7],[4,4,1],[5,6,3],[1,3,7]]  # (1*5) 5 lists
>> S = [[1,3,7],[6,8,1],[3,2,7]]  # (1*3) 3 lists
>>
>> I want to get a Boolean array marking which lists of F also appear in S:
>>
>> [False, True, False, False, True]  # (1*5) 5 Booleans for 5 lists of F
>>
>> Using IM = reduce(np.in1d, (F, S)) gives a result for each number in each
>> list of F:
>>
>> [ True  True  True  True  True  True False False  True False  True  True
>>   True  True  True]   # (1*15)
>>
>> Using IM = reduce(np.isin, (F, S)) also gives a result for each number in
>> each list of F, but in another shape:
>>
>> [[ True  True  True]
>>  [ True  True  True]
>>  [False False  True]
>>  [False  True  True]
>>  [ True  True  True]]   # (5*3)
>>
>> The correct result is obtained with IM = [i in S for i in F] for the
>> example lists, but when I use this code on my two main, larger numpy
>> arrays of lists:
>>
>> https://drive.google.com/file/d/1YUUdqxRu__9-fhE1542xqei-rjB3HOxX/view?usp=sharing
>>
>> numpy array: 3036 lists
>>
>> https://drive.google.com/file/d/1FrggAa-JoxxoRqRs8NVV_F69DdVdiq_m/view?usp=sharing
>>
>> numpy array: 300 lists
>>
>> It gives the wrong answer. For the main files it must give 3036 Booleans,
>> of which only 300 are True. I don't understand why it gives wrong answers.
>> It seems to be applied only to the 3rd element of each list of F. I would
>> prefer to use the reduce function with np.in1d or np.isin instead of the
>> last method. How can I make each of the three methods above work?
>
>
> Thank you for providing the data. Can you show a complete, runnable code 
> sample that fails? There are several things that could go wrong here, and we 
> can't be sure which is which without the exact code that you ran.
>
> In general, you may well have problems with the floating point data that you 
> are not seeing with your integer examples.
>
> FWIW, I would continue to use something like the `IM = [i in S for i in F]` 
> list comprehension for data of this size.

Although somewhat off-topic for the numpy aspect, for completeness'
sake let me add that you'll probably want to first turn your list of
lists `S` into a set of tuples, and then look up each list in `F`
converted to a tuple (`[tuple(lst) in setified_S for lst in F]`). That
would probably be a lot faster for large lists.
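
A minimal sketch of that set-of-tuples lookup, using the small example lists from the question. It assumes rows must match exactly, which may not hold for the floating point data in the linked files:

```python
# Example lists from the original question.
F = [[1, 2, 3], [3, 2, 7], [4, 4, 1], [5, 6, 3], [1, 3, 7]]
S = [[1, 3, 7], [6, 8, 1], [3, 2, 7]]

# Lists are not hashable, so convert each row to a tuple before
# putting it in a set. Membership tests on a set are O(1) on average.
setified_S = {tuple(lst) for lst in S}

# One Boolean per row of F: True if the same row occurs in S.
IM = [tuple(lst) in setified_S for lst in F]

print(IM)  # [False, True, False, False, True]
```

With `[i in S for i in F]` every row of F is compared against every row of S in turn; the set version builds the lookup table once and then does one hash lookup per row of F.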

András



> You aren't getting any benefit trying to convert to arrays and using
> our array set operations. They are written for 1D arrays of numbers,
> not 2D arrays (attempting to treat them as 1D arrays of lists) and
> won't really work on your data.
>
> --
> Robert Kern


Re: [Numpy-discussion] How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

2021-03-14 Thread Robert Kern
On Sun, Mar 14, 2021 at 3:06 PM Ali Sheikholeslam <sheikholeslam@gmail.com> wrote:

> I have written a question in:
>
> https://stackoverflow.com/questions/66623145/how-to-get-boolean-matrix-for-similar-lists-in-two-different-size-numpy-arrays-o
> It was recommended by numpy that I send this question to the mailing list.
>
> The question is as follows. I would appreciate it if you could advise me on
> how to solve the problem:
>
> First, here is a small example of two lists:
>
> F = [[1,2,3],[3,2,7],[4,4,1],[5,6,3],[1,3,7]]  # (1*5) 5 lists
> S = [[1,3,7],[6,8,1],[3,2,7]]  # (1*3) 3 lists
>
> I want to get a Boolean array marking which lists of F also appear in S:
>
> [False, True, False, False, True]  # (1*5) 5 Booleans for 5 lists of F
>
> Using IM = reduce(np.in1d, (F, S)) gives a result for each number in each
> list of F:
>
> [ True  True  True  True  True  True False False  True False  True  True
>   True  True  True]   # (1*15)
>
> Using IM = reduce(np.isin, (F, S)) also gives a result for each number in
> each list of F, but in another shape:
>
> [[ True  True  True]
>  [ True  True  True]
>  [False False  True]
>  [False  True  True]
>  [ True  True  True]]   # (5*3)
>
> The correct result is obtained with IM = [i in S for i in F] for the
> example lists, but when I use this code on my two main, larger numpy
> arrays of lists:
>
>
> https://drive.google.com/file/d/1YUUdqxRu__9-fhE1542xqei-rjB3HOxX/view?usp=sharing
>
> numpy array: 3036 lists
>
>
> https://drive.google.com/file/d/1FrggAa-JoxxoRqRs8NVV_F69DdVdiq_m/view?usp=sharing
>
> numpy array: 300 lists
>
> It gives the wrong answer. For the main files it must give 3036 Booleans,
> of which only 300 are True. I don't understand why it gives wrong answers.
> It seems to be applied only to the 3rd element of each list of F. I would
> prefer to use the reduce function with np.in1d or np.isin instead of the
> last method. How can I make each of the three methods above work?
>

Thank you for providing the data. Can you show a complete, runnable code
sample that fails? There are several things that could go wrong here, and
we can't be sure which is which without the exact code that you ran.

In general, you may well have problems with the floating point data that
you are not seeing with your integer examples.
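
A small illustration of the kind of floating point problem meant here. The values are made up for the example and are not taken from the linked files, and `row_in` is a hypothetical helper, not a NumPy or standard library function:

```python
import math

def row_in(row, rows, rel_tol=1e-9, abs_tol=0.0):
    """Return True if `row` matches some row in `rows` within a tolerance.

    Hypothetical helper for illustration; the tolerances are assumptions
    and would need to be chosen to suit the real data.
    """
    return any(
        len(row) == len(other)
        and all(
            math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
            for a, b in zip(row, other)
        )
        for other in rows
    )

# 0.1 + 0.2 is stored as 0.30000000000000004, which is not equal to 0.3.
S = [[0.3, 1.0]]
F = [[0.1 + 0.2, 1.0]]

exact = [row in S for row in F]          # exact comparison misses the row
tolerant = [row_in(row, S) for row in F]  # tolerant comparison finds it

print(exact)     # [False]
print(tolerant)  # [True]
```

Integer examples never show this, because small integers are represented exactly.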

FWIW, I would continue to use something like the `IM = [i in S for i in F]`
list comprehension for data of this size. You aren't getting any benefit
trying to convert to arrays and using our array set operations. They are
written for 1D arrays of numbers, not 2D arrays (attempting to treat them
as 1D arrays of lists) and won't really work on your data.
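
For illustration only, here is a sketch of a row-wise comparison written with plain broadcasting rather than the array set operations. It uses the integer example from the question and exact equality, and it is not meant as a recommendation over the list comprehension:

```python
import numpy as np

# Example data from the original question, as 2D integer arrays.
F = np.array([[1, 2, 3], [3, 2, 7], [4, 4, 1], [5, 6, 3], [1, 3, 7]])
S = np.array([[1, 3, 7], [6, 8, 1], [3, 2, 7]])

# Compare every row of F with every row of S.
# F[:, None, :] has shape (5, 1, 3) and S[None, :, :] has shape (1, 3, 3),
# so the comparison broadcasts to shape (5, 3, 3).
row_matches = (F[:, None, :] == S[None, :, :]).all(axis=-1)  # shape (5, 3)

# A row of F is "in" S if it equals at least one row of S.
IM = row_matches.any(axis=-1)  # shape (5,)

print(IM.tolist())  # [False, True, False, False, True]
```

The intermediate array holds one Boolean per (row of F, row of S, column) triple, so for 3036 and 300 rows of 3 values it stays at a few megabytes; for much larger inputs it grows with the product of the two row counts.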

-- 
Robert Kern


[Numpy-discussion] How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

2021-03-14 Thread Ali Sheikholeslam
I have written a question in:
https://stackoverflow.com/questions/66623145/how-to-get-boolean-matrix-for-similar-lists-in-two-different-size-numpy-arrays-o
It was recommended by numpy that I send this question to the mailing list.

The question is as follows. I would appreciate it if you could advise me on
how to solve the problem:

First, here is a small example of two lists:

F = [[1,2,3],[3,2,7],[4,4,1],[5,6,3],[1,3,7]]  # (1*5) 5 lists
S = [[1,3,7],[6,8,1],[3,2,7]]  # (1*3) 3 lists

I want to get a Boolean array marking which lists of F also appear in S:

[False, True, False, False, True]  # (1*5) 5 Booleans for 5 lists of F

Using IM = reduce(np.in1d, (F, S)) gives a result for each number in each
list of F:

[ True  True  True  True  True  True False False  True False  True  True
  True  True  True]   # (1*15)

Using IM = reduce(np.isin, (F, S)) also gives a result for each number in
each list of F, but in another shape:

[[ True  True  True]
 [ True  True  True]
 [False False  True]
 [False  True  True]
 [ True  True  True]]   # (5*3)
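
The two outputs above can be reproduced with the sketch below. With only two items in the sequence, reduce(f, (F, S)) is simply f(F, S), and np.isin tests each number of F against all the numbers of S, so whole lists are never compared. The flattened form is obtained here with ravel() instead of np.in1d, which is deprecated in recent NumPy releases:

```python
from functools import reduce

import numpy as np

F = [[1, 2, 3], [3, 2, 7], [4, 4, 1], [5, 6, 3], [1, 3, 7]]
S = [[1, 3, 7], [6, 8, 1], [3, 2, 7]]

# reduce over a 2-item sequence just calls np.isin(F, S) once.
elementwise = reduce(np.isin, (F, S))

# Each entry says whether that single number occurs anywhere in S,
# so the result keeps the (5, 3) shape of F.
print(elementwise.shape)  # (5, 3)

# Flattening gives the 15-element form shown for np.in1d.
flat = elementwise.ravel()
print(flat.shape)  # (15,)
```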

The correct result is obtained with IM = [i in S for i in F] for the
example lists, but when I use this code on my two main, larger numpy
arrays of lists:

https://drive.google.com/file/d/1YUUdqxRu__9-fhE1542xqei-rjB3HOxX/view?usp=sharing

numpy array: 3036 lists

https://drive.google.com/file/d/1FrggAa-JoxxoRqRs8NVV_F69DdVdiq_m/view?usp=sharing

numpy array: 300 lists

It gives the wrong answer. For the main files it must give 3036 Booleans,
of which only 300 are True. I don't understand why it gives wrong answers.
It seems to be applied only to the 3rd element of each list of F. I would
prefer to use the reduce function with np.in1d or np.isin instead of the
last method. How can I make each of the three methods above work?