>
> ---------- Forwarded message ----------
> From: denis
> Date: 18 November 2011 02:24
> Subject: sklearn pairwise_distances( sparse, sparse, l1 ) ?
> To: [email protected]
>
>
> Robert,
> could you take a look at the attached testcase for sklearn
> pairwise_distances( sparse, sparse, l1 ) ?
> I'd try fixing it myself, but
> 1) i don't understand the failing line
> D = np.abs(X[:, np.newaxis, :] - Y[np.newaxis, :, :])
> 2) i get grouchy every time i get near scipy.sparse.
>
> Thanks,
> cheers
> -- denis
>
I received this email about a bug in pairwise.manhattan_distances when
both X and Y are sparse.
I have no idea what is causing the error*, so I was unable to help.
Any thoughts?
* What does indexing with np.newaxis do here?
Thanks,
- Robert
--
Public key at: http://pgp.mit.edu/ Search for this email address and select
the key from "2011-08-19" (key id: 54BA8735)
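On the np.newaxis question, here is a minimal sketch with small dense arrays. The arrays X and Y below are made up for illustration; only the indexing expression is taken from the failing line. The final sum over the last axis is an assumption about what follows that line in manhattan_distances, since the traceback stops before it.

```python
import numpy as np

# Hypothetical inputs: X has 2 rows, Y has 3 rows, 4 features each.
X = np.array([[0., 1., 0., 2.],
              [3., 0., 0., 1.]])
Y = np.array([[0., 0., 0., 0.],
              [1., 1., 1., 1.],
              [0., 2., 0., 2.]])

# np.newaxis inserts a length-1 axis, so the two operands have shapes
# (2, 1, 4) and (1, 3, 4). Broadcasting then pairs every row of X with
# every row of Y, giving a (2, 3, 4) array of feature-wise differences.
diff = X[:, np.newaxis, :] - Y[np.newaxis, :, :]

# Assumed next step: summing |diff| over the feature axis gives the
# L1 (Manhattan) distance for each (row of X, row of Y) pair.
D = np.abs(diff).sum(axis=2)

print(diff.shape)  # (2, 3, 4)
print(D.shape)     # (2, 3)
```

The intermediate array is three-dimensional. scipy.sparse matrices are strictly two-dimensional, which would explain why the same indexing expression has no sparse counterpart.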
# from: pairwise-l1.py
# run: 17 Nov 2011 16:14 in ~bz/py/ml/sklearn mac 10.4.11 ppc
versions: numpy 1.6.0 scipy 0.9.0 sklearn 0.9 py 2.6.4 (r264:75706, Nov 3 2009, 13:13:00)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)]
randomcsr: [ 0 14 30 41 72]
randomcsr: [ 9 18 34 39 53]
pairwise_distances l2: 1.19
cdist l1: 3.15
Traceback (most recent call last):
  File "pairwise-l1.py", line 44, in <module>
    d1 = pairwise_distances( x, y, metric="l1" )
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/sklearn/metrics/pairwise.py", line 448, in pairwise_distances
    return pairwise_distance_functions[metric](X, Y, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/sklearn/metrics/pairwise.py", line 231, in manhattan_distances
    D = np.abs(X[:, np.newaxis, :] - Y[np.newaxis, :, :])
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/csr.py", line 220, in __getitem__
    P = extractor(col,self.shape[1]).T #[1:2,[1,2]]
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/csr.py", line 186, in extractor
    indices = asindices(indices)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/csr.py", line 168, in asindices
    raise IndexError('invalid index')
IndexError: invalid index
real 3 user 1 sys 0 cpu 82.01 %
# sklearn pairwise_distances( sparse, sparse, l1 ) ?
from __future__ import division
import sys
import numpy as np
from scipy import sparse  # $scipy/sparse/csr.py
from scipy.spatial.distance import cdist  # $sklearn/metrics/pairwise.py
from sklearn.metrics import pairwise_distances
    # $sklearn/metrics/pairwise.py 232 ?
    # D = np.abs(X[:, np.newaxis, :] - Y[np.newaxis, :, :])

__date__ = "2011-11-17 Nov"
__author_email__ = "denis-bz-py at t-online dot de"

import scipy, sklearn
print "versions: numpy %s scipy %s sklearn %s py %s" % (
    np.__version__, scipy.__version__, sklearn.__version__, sys.version)

N = 100
density = .05
seed = 1
exec( "\n".join( sys.argv[1:] ))  # run this.py N= ...
np.set_printoptions( 1, threshold=100, edgeitems=10, suppress=True )
np.random.seed(seed)

def randomcsr( N, density ):
    """ random 1 x N csr_matrix: zeros plus up to int(N * density) values in [0, 1) """
    sample = np.random.random_sample( int( N * density ))
    x = np.zeros(N)
    x[(sample * N).astype(int)] = sample
    xcsr = sparse.csr_matrix(x)
    print "randomcsr:", xcsr.indices  # sorted ?
    return xcsr

x = randomcsr( N, density )
y = randomcsr( N, density )

d2 = pairwise_distances( x, y )
print "pairwise_distances l2: %.3g" % d2

d1 = cdist( x.todense(), y.todense(), metric="cityblock" )
print "cdist l1: %.3g" % d1

d1 = pairwise_distances( x, y, metric="l1" )
print "pairwise_distances l1: %.3g" % d1
    # IndexError: invalid index
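
One possible way around the failing line, sketched under assumptions: because a scipy.sparse matrix cannot take the three-index form, the L1 distance can instead be computed one pair of rows at a time, keeping every intermediate result sparse and two-dimensional. The helper name sparse_manhattan is hypothetical and is not part of sklearn; this is a slow illustrative loop, not a proposed patch.

```python
import numpy as np
from scipy import sparse


def sparse_manhattan(X, Y):
    """L1 distances between the rows of two sparse matrices.

    Hypothetical helper for illustration only; it loops over row pairs
    in Python, so it is only practical for small inputs.
    """
    X = sparse.csr_matrix(X)
    Y = sparse.csr_matrix(Y)
    D = np.zeros((X.shape[0], Y.shape[0]))
    for i in range(X.shape[0]):
        for j in range(Y.shape[0]):
            # X[i] and Y[j] are 1 x n sparse rows; their difference is
            # still sparse and 2-D, so no np.newaxis indexing is needed.
            D[i, j] = abs(X[i] - Y[j]).sum()
    return D


# Usage with two small made-up sparse row vectors.
x = sparse.csr_matrix(np.array([[0., 0.5, 0., 0.25]]))
y = sparse.csr_matrix(np.array([[0.1, 0., 0., 0.25]]))
d1 = sparse_manhattan(x, y)  # array of shape (1, 1)
```

For inputs small enough to densify, the result can be cross-checked against cdist( x.todense(), y.todense(), metric="cityblock" ), as the test script does.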
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general